Hub-and-Spoke Network Design Considering Congestion and Flow-Based Cost Function

Khosravi, Shahrzad; Bozorgi, Ali; Zahedi-Seresht, Mazyar

doi:10.3390/app14156416


by Shahrzad Khosravi ¹,*, Ali Bozorgi ² and Mazyar Zahedi-Seresht ¹

¹ Department of Quantitative Studies, University Canada West, Vancouver, BC V6Z 0E5, Canada

² Discipline of Business Analytics, University of Sydney Business School, Sydney, NSW 2008, Australia

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(15), 6416; https://doi.org/10.3390/app14156416

Submission received: 25 April 2024 / Revised: 10 July 2024 / Accepted: 11 July 2024 / Published: 23 July 2024


Abstract:

This paper presents a model for hub-and-spoke network design that considers congestion and a flow-based cost function. The number of hubs and spokes is not known in advance, and the objective is to minimize the total cost (including the transportation cost, lost demand, and facility setup cost). In the post-pandemic era, government-imposed restrictions on airport congestion are expected as a health and safety measure. Unlike the current literature, which considers a monetary penalty for congestion, we treat congestion as an externally imposed factor that should be modeled as a constraint. We take a gravity-based modeling approach to obtain the desirability of a facility and calculate the demand matrix of the network. To solve the model, a Benders decomposition approach is proposed. Without the Benders decomposition approach, only instances with up to ten nodes were solved within a reasonable time, but with it, instances with up to forty nodes were solved. A heuristic algorithm is developed to handle larger instances. A set of experiments is conducted using data from the Turkish Network dataset to study various aspects of the proposed formulation and the effects of different parameters on the performance of the model.

Keywords:

transportation; hub-and-spoke network design; congestion; flow-based cost function; COVID-19

1. Introduction

Hub-and-spoke networks are a significant type of graph topology used in various industries, including transportation, telecommunication, air travel, postal delivery, and freight. In these networks, a few nodes act as hubs—often representing major airports or stations—while the remaining nodes are known as spokes. Hub networks originate in transportation and telecommunication systems in which several origin/destination nodes send and receive some flow. While the nodes may be connected to send/receive directly, joining a hub network to route the flow via “hub” nodes is preferred. This is to reduce the incurred cost as the consolidated flow among the hub network benefits from the economy of scale. The transportation network is a common example of an industry with such problems.

It is preferred to include the congestion of a hub facility in hub location problems due to the effect of pandemics on travel behaviors. There are several reasons for such behavior, among which we can refer to the following:

For passenger transportation, overcrowded facilities may not be preferred in the post-pandemic era due to health and safety concerns and regulations, as well as due to the higher chance of delays and cancelations in congested facilities.

For cargo transportation companies, firms are eager to use their resources in a balanced manner. An overcrowded facility (even to transport cargo) may increase associated costs and complexities and make the working environment less desirable.

For facility management at a hub, the congestion negatively impacts scheduling and planning, causing several delays and cancelations. Therefore, accounting for congestion while designing the network is extremely important [1].

There are three ways to address congestion in the hub-and-spoke model: (1) imposing classical capacity constraints on hub nodes for limiting the amount of flow entering the hubs, (2) incorporating costs of congestion effects explicitly into the objective function using a convex cost function that increases exponentially as more flows go through the hubs, and (3) accounting for congestion in hub nodes using queuing theory and calculating the waiting time of the flow in terms of congestion. Based on this, we incorporate the congestion of hub facilities into our mathematical model using the first method.

This research considers the location problem faced by a transportation company that must locate new facilities, considering the perceived utility of its customers as well as the congestion in the hub facilities. In this work, the inter-hub cost coefficient is not necessarily a fixed value for every route; instead, it is a function of the number of passengers transported on each route. In other words, we consider the economy-of-scale coefficient to be dependent on the flow of the associated link. This avoids assigning large discount factors to small inter-hub flows. We incorporate flow-based inter-hub cost coefficients into the network design formulation. The perceived utility of different facilities by customers is treated as a set of given parameters. It should be noted that while the model incorporates the customers’ perceived utility, the sole decision-maker of the model is the transportation company, not the customers. As a result, customers from nodes with no facility may either travel to nearby nodes with established facilities or leave the service (which is treated as “lost demand”). This travel cost (from no-facility nodes to the nearby node with a facility) is not borne by the transportation company and is therefore not included in the model, but the lost demand is. It is also assumed that the potential demand values among different origin–destination (O-D) pairs are known a priori, as given by marketing or strategic studies. The realized demand between O-D pairs is therefore elastic: it is determined by the location of spoke facilities and the perceived utility of the established facilities, as also discussed by Redondo et al. [2].

We denote our model as a hub-and-spoke location/transportation problem (HSLTP). The model is originally developed from the point of view of a new airline/transportation company entering the market. With small modifications, it can also be used by existing companies for expansion purposes or for adapting their facilities to changes in demand, cost, or other exogenous factors. It can also be used to evaluate the status of a company by running the optimization model and comparing the optimum network with the company’s current network.

The rest of this paper is organized as follows. In Section 2, we review the literature related to this work. Section 3 introduces the problem at hand and proposes its mathematical formulation, including the linearization techniques (Appendix A) that are applied to the developed MINLP model to transform it into an MILP model. Section 4 contains a description of an exact solution approach based on a modified Benders decomposition, followed by a heuristic algorithm developed for larger instances of the problem in Section 5. Numerical experiments are presented in Section 6. Section 7 provides the analysis of the results and the conclusion of the work and outlines some future research lines.

2. Literature Review

The problem of finding the optimum location for hub facilities in a given network is well-studied in the literature and these are generally named hub location problems (HLPs), in which it is assumed that the hub network is a complete graph [3]. Hub location problems receive great attention in the literature, and many aspects of this set of problems have been studied. There are several review articles in this field for interested readers [4,5,6,7,8].

There is another version of HLPs, in which the assumption of having a complete network is relaxed. Thus, the model should identify the links to establish in the hub network and determine where to locate the hub facilities. These types of problems are known in the literature as Hub Arc Location Problems (HALPs) [9].

Several transportation modes are studied in the hub location/transportation literature, such as sea-based/maritime [10,11,12,13,14], railroad [15,16], air [17,18,19,20], and multi-modal [21,22,23,24].

2.1. Hub-and-Spoke Transportation Network Design

On passenger transportation hub/location issues, Gelareh and Nickel [25] formulated a hub-and-spoke location problem for the transportation network. In their proposed model, the authors consider a network of public transportation, such as bus and subway systems. They explain that a passenger might use some links to arrive at the hub network, traverse some links on the hub graph, and then exit the hub graph, either for his/her destination or to a spoke facility, which would be his/her final destination. It is obvious that the passenger determines the origin and the destination, and then the model solves the rest of the problem. In other words, once the model receives the origin/destination data, it solves a transportation problem to find the optimum route for the passenger to pass to arrive at the destination, considering cost function minimization. They develop an advanced version of the Benders decomposition, using some accelerated techniques such as the one explained by Magnanti and Wong [26], and also develop a heuristic algorithm to further improve the solution algorithm. The capacity of a spoke link can also be limited. Zheng et al. [27] propose a network in which the capacity of a spoke link is a function of the incentive provided to transporters.

Regarding the inter-hub cost coefficient, the literature contains two lines of research: (i) articles that consider a cost coefficient if a passenger/flow is transported via the hub link, and (ii) articles that do not consider such coefficient. The researchers in the first group argue that the major point of scaling a facility to a hub facility is to consolidate the transportation flow and benefit from the economy of scale. Despite this, several articles fail to represent that concept. Campbell [28] discusses this issue, surveys the literature, and prepares a list of several articles that consider cost coefficients (as a result of economies of scale) for a hub link, which, in fact, has less flow than several other non-hub links in the same network. This reveals the fact that such an assumption/formulation is not always valid. Campbell et al. [29] explain that considering a constant, unique cost factor for hub links may not be an appropriate approach for this topic, based on the research articles in the literature. They provide several examples of research that considers the cost efficiency factor for a hub link, which has less passenger flow than some non-hub links.

2.2. Utility-Based Demand Calculation for Hub Location Problems

In this work, we formulate the problem for an airline transportation system, which also envisions the perceived utility of users to decide upon the spoke locations. In the literature, utility is also referred to as “attractiveness”. In this line, Eiselt and Marianov [30] developed a new model for a new airline entering the market. They formulated the problem as a p-hub location, considering a demand function, based on the competitive nature of the airline industry. They consider attractiveness factors to represent the demand, which is based on the cost and travel time of a route, in addition to the attractiveness of the hub as perceived by each passenger. The authors utilize heuristic concentration to solve the problem in two phases: (1) reducing the number of hub location candidates to a much smaller set that they state to be local optima, and (2) solving the two-vertex substitution problem and applying it to the Australian Postal dataset.

The proximity approach has the subtle assumption that customers choose a facility merely based on spatial factors: they only choose the closest facility. Such an assumption leads to an “All or Nothing” result: the customers either choose a facility and the whole demand uses the chosen facility, or they do not choose a facility (and the demand is lost). This is not a true representation of the real world. While distance is an important factor for customers as to whether or not to choose a facility, there are several other factors in this decision-making procedure, which support the use of the utility approach:

If a facility is located in a slightly different location, it would lose the whole demand and will not be chosen by customers.

There might be some other nearby facilities with higher attraction for customers, and the customers prefer to drive a longer distance there.

It is not ensured that all customers measure the distance from a facility exactly and precisely, which leads them to choose different facilities, which are not necessarily the closest facility to the customer.

As discussed by Francis et al. [31] and Francis et al. [32], the type C error in aggregation may happen, which is as follows: the facilities are not a spot on a map; instead, they are locations with surface area. Thus, a facility may be closest to one customer who resides in a segment of that area while for another customer that case is not true.

In addition to all the analytical reasons behind choosing a facility, shopping and touristic trips may be other aspects for choosing a facility which rely on other purposes for the trip [33]. Therefore, customers may prefer to drive a longer distance to reach a facility with a different arrangement of shopping and recreational activities.

In this research, we not only find locations for hub nodes but also select the spoke nodes (domestic facilities) to design the hub-and-spoke network together. Based on an earlier discussion, the attractiveness (utility) of a facility should be considered while selecting the spoke nodes.

In all of the above-mentioned articles, the hub locations are chosen from a set of given nodes. It is important to find out the origin and method to obtain the initial set of nodes. In a recent study, Alibeyg et al. [34] propose a profit maximization hub network problem that decides upon not only the hub locations but also the spoke facilities. In another study, Khosravi and Akbari [35] propose a hub-and-spoke location model, using the gravity function to choose a set of potential facilities based on some attractiveness criteria (Drezner [36]; Drezner and Drezner [37,38]). The model then decides upon a subset of nodes to be upgraded to hub facilities. This model, however, handles neither the congestion of the hub nodes nor a variable cost function for the hub network.

2.3. Contribution of This Work

Airports’ congestion is expected to be restricted by government policies in the post-pandemic era, to address health and safety concerns as a consequence of the outbreak of COVID-19 in 2019–2020. Highly congested ports and airports are not favorable, due to the high rate of delayed flights reported in congested airports [39,40]. Congestion is handled in two different ways: either by limiting the flow, as studied by Ebery et al. [41] and Ernst and Krishnamoorthy [42], or by considering the monetary cost of congestion, which impacts the objective function. The latter was first modeled by Elhedhli and Hu [43], who proposed a nonlinear cost penalty term for the congestion of a hub facility. De Camargo et al. [44] expand this model to study the effect of considering congestion in a multiple allocation hub location problem in the airline industry. Najy and Diabat [45] incorporate flow-dependent economies of scale and the effects of congestion, utilizing piece-wise linearization for the nonlinear components.

In this work, we aim to bridge this gap and propose a mathematical model for multiple allocation hub-and-spoke location/transportation problems, considering the congestion balance of facilities as well as flow-based hub transportation cost functions. To the best of our knowledge, no existing research considers the utility of facilities in a hub-and-spoke location/transportation problem that also accounts for the transportation problem and hub congestion.

3. Mathematical Formulation of the Model

As discussed earlier, we aim to develop a hub-and-spoke location model, which includes determining the location of domestic facilities (as spoke) and upgrading a number of those as the hub locations. We also determine the inter-hub links to be established, as well as the routing decisions among different O-Ds. We present the primary model in Section 3.2, adding the congestion balance constraints in Section 3.3, and the complete model (primary model + congestion balance + flow-based cost function) in Section 3.4.

In the literature, a set of MIP models is proposed by Nickel et al. [46] for urban transportation. Another set of models is presented by Gelareh et al. [47] and Gelareh and Nickel [24]. In this work, we use the modeling approach of the latter to develop the base model. There are several related studies and formulations on this topic, but Gelareh and Nickel [24] were able to demonstrate the superiority of their model in terms of having fewer constraints and variables, as well as less computational time, on a benchmark dataset.

3.1. Formal Definition and Modeling Assumptions

We can formally define an HSLTP as follows. Let G = (N, A) be a connected directed graph, where N = {1, 2, …, n} represents the set of nodes and A represents the set of arcs.

Parameters: For each $i \in N$, $D_{i} \geq 0$ represents the cost of having a domestic facility, and $F_{i} \geq 0$ denotes the cost of upgrading to a hub facility. In our model, we use $P_{i j}$ to denote the potential demand of flow from i to j and $C_{i j}$ to represent the unit transportation cost of passing hub link i-j. The cost of establishing a hub link k-l is denoted by $I_{k l}$, while $U_{m n i j}$ is the utility value of transporting the demand from m to n through i to j, which is zero (0) if there is no (or not close enough) facility in i and j. Then, $0 < \alpha < 1$ is the discount factor for using a hub link. The capital assigned for facility establishment is expressed as “Budget”, while B represents the unit lost demand cost.

Decision variables: $s_{i}$ is a binary decision variable that takes the value 1 if we decide to locate a domestic facility at point “i”; similarly, $h_{k}$ is a binary decision variable that takes the value 1 if we decide to upgrade the facility at point “k” to a hub. The binary decision variable $y_{k l}$ states whether we decide to establish a link between hub nodes k and l (1) or not (0). The variable $x_{i j k l} = 1$, $i \neq j, k \neq l$, if the optimal path from i to j passes through the hubs k and l, and 0 otherwise; $a_{i j k} = 1$, $i \neq j, k \neq i, j$, if, in the optimal path from i to j, node i (non-hub) is directly connected to hub k, and 0 otherwise; $b_{i j k} = 1$, $i \neq j, k \neq i, j$, if, in the optimal path from i to j, node k (hub) is directly connected to node j (non-hub), and 0 otherwise; and $e_{i j} = 1$, $i \neq j$, if the optimal path from i to j passes through the (i, j) link in which either one or none of i and j is a non-hub. Also, $w_{i j}$ denotes the flow from i to j (this is obtained by solving the attractiveness function of facilities). Note that the indices i, j, k, l, m, n, o, d can be used interchangeably.

3.2. Mathematical Model

Let $U_{m n i j}$ denote the utility, for demand originating from m to n, of a path passing through nodes i and j. The proportion of demand originating from m to n that chooses i and j as its O-D can be calculated as

\frac{U_{m n i j}}{\sum_{o, d} U_{m n o d} s_{o} s_{d}}

(1)

As in the real world, not all the demand can be satisfied, especially once we decide not to locate a facility at every demand point. As a result, the proportions of served demand do not necessarily sum to 1, leaving some “lost demand”, which is calculated as

\sum_{m, n} P_{m n} - \sum_{i, j} \sum_{m, n} P_{m n} \frac{U_{m n i j} \cdot s_{i} s_{j}}{\sum_{o, d} U_{m n o d} s_{o} s_{d}}

(2)
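As a minimal sketch of how (1) and (2) could be evaluated for a fixed set of open facilities, the following Python function computes the realized flows and the lost demand. The function name `gravity_demand`, the dictionary layout of `U`, the example data, and the treatment of a zero denominator (the whole demand of that pair is counted as lost) are illustrative assumptions, not part of the original formulation.

```python
def gravity_demand(P, U, s):
    """Evaluate the demand shares of (1) and the lost demand of (2).

    P: n x n nested list, P[m][n_] = potential demand from m to n_.
    U: dict {(m, n_, i, j): utility}; missing keys mean zero utility
       (assumed data layout).
    s: list of 0/1 flags, s[i] = 1 if a domestic facility is open at i.
    """
    n = len(s)
    w = [[0.0] * n for _ in range(n)]
    for m in range(n):
        for n_ in range(n):
            if P[m][n_] == 0:
                continue
            # Utilities of all facility pairs (i, j) that are actually open.
            open_utils = {
                (i, j): U.get((m, n_, i, j), 0.0)
                for i in range(n) for j in range(n)
                if i != j and s[i] and s[j]
            }
            denom = sum(open_utils.values())
            if denom == 0:
                continue  # assumption: no usable facility pair, demand is lost
            for (i, j), u in open_utils.items():
                w[i][j] += P[m][n_] * u / denom
    lost = sum(map(sum, P)) - sum(map(sum, w))
    return w, lost


# Hypothetical 3-node instance with facilities at nodes 0 and 1 only.
P = [[0, 100, 0], [0, 0, 0], [5, 10, 0]]
U = {(0, 1, 0, 1): 1.0, (2, 1, 0, 1): 0.5, (2, 1, 2, 1): 1.0}
w, lost = gravity_demand(P, U, s=[1, 1, 0])
```

In this made-up instance, the 10 units of demand from node 2 to node 1 are served through the facility pair (0, 1), while the 5 units from node 2 to node 0 find no usable pair and are counted as lost.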

Let us denote the following model as HSLTP-01:

\min \sum_{i} \sum_{j \neq i} \sum_{k} \sum_{l \neq k} \alpha w_{i j} C_{k l} x_{i j k l} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{i k} a_{i j k} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{k j} b_{i j k} + \sum_{i} \sum_{j \neq i} w_{i j} C_{i j} e_{i j} + \sum_{k} F_{k} h_{k} + \sum_{k} D_{k} s_{k} + \sum_{k} \sum_{l > k} I_{k l} y_{k l} + B \cdot (\sum_{m, n} P_{m n} - \sum_{m, n} w_{m n})

(3)

Subject to

y_{k l} \leq h_{k}, \quad y_{k l} \leq h_{l} \quad \forall k, l \neq k

(4)

h_{k} \leq s_{k} \forall k

(5)

\sum_{l \neq i} x_{i j i l} + \sum_{l \neq i, j} a_{i j l} + e_{i j} = z, i, j \neq i

(6)

\sum_{l \neq i} x_{i j l j} + \sum_{l \neq i, j} b_{i j l} + e_{i j} = z, i, j \neq i

(7)

\sum_{i, j} w_{i j} \leq z * M, j, k \neq j

(8)

\sum_{l \neq k, i} x_{i j k l} + b_{i j k} = \sum_{l \neq k, j} x_{i j l k} + a_{i j k}, i, j \neq i, k \neq i, j

(9)

x_{i j k l} \leq y_{k l}, i, j \neq i, k, l

(10)

\sum_{l \neq k} x_{k j k l} \leq h_{k}, j, k \neq j

(11)

\sum_{k \neq l} x_{i l k l} \leq h_{l}, i, l \neq i

(12)

a_{i j k} \leq 1 - h_{i}, i, j \neq i, k \neq i, j

(13)

b_{i j k} \leq 1 - h_{j}, i, j \neq i, k \neq i, j

(14)

a_{i j k} + \sum_{l \neq j, k} x_{i j k l} \leq h_{k}, i, j \neq i, k \neq i, j

(15)

b_{i j k} + \sum_{l \neq k, i} x_{i j k l} \leq h_{k}, i, j \neq i, k \neq i, j

(16)

e_{i j} + 2 x_{i j i j} + \sum_{l \neq j, i} x_{i j i l} + \sum_{l \neq i, j} x_{i j l j} \leq h_{i} + h_{j}, i, j \neq i

(17)

\sum_{k} F_{k} h_{k} + \sum_{k} D_{k} s_{k} + \sum_{k} \sum_{l > k} I_{k l} y_{k l} \leq \text{Budget}

(18)

w_{i j} = \sum_{m, n} P_{m n} \frac{U_{m n i j} \cdot s_{i} s_{j}}{\sum_{o, d} U_{m n o d} s_{o} s_{d}}, i, j \neq i

(19)

x_{i j k l}, a_{i j k}, b_{i j k}, e_{i j}, h_{i}, s_{i}, y_{k l}, z \in \{0, 1\}, \quad w_{i j} \in R^{+}

(20)

In the above mathematical programming formulation, the objective function consists of hub setup costs, spoke setup costs, hub link establishment costs, transportation costs, and lost demand costs, as shown in (3). Constraint (4) ensures that a hub link may be established only if both endpoints have hub facilities, while constraint (5) guarantees that only (established) spoke nodes can be upgraded to hub facilities. In other words, to establish a hub facility, a node should first be chosen as a spoke and then upgraded to a hub facility. Constraints (6)–(9) are the flow balance constraints. Constraint (10) ensures that the flow can traverse established hub links only. Only the flow coming out of (into) a hub facility may use an outgoing (incoming) hub link; this is modeled in constraints (11) and (12). Constraints (13) and (14) ensure that any outgoing (incoming) flow from (to) a hub facility that is not traversed via a hub link should not be destined to (originated from) a hub facility. The flow between i and j should not pass through any other node on its path unless that node is a hub node; constraints (15) and (16) ensure this. In addition, the path between i and j has three (3) possible combinations: both i and j are hub facilities, one of the two is a hub facility, or neither is a hub facility, as shown in constraint (17). The budget constraint is formulated in (18), and the utility is used to calculate the demand values, as shown in (19). Finally, constraint (20) is a domain constraint.
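To make the structural constraints concrete, the sketch below checks a candidate design against constraints (4), (5), and (18) only. It is an illustrative helper under assumed data structures (lists for `s`, `h`, `D`, `F`; dictionaries keyed by `(k, l)` for `y` and `I`); the function name and the example numbers are hypothetical, and the routing constraints (6)–(17) are deliberately left out.

```python
def check_design(s, h, y, D, F, I, budget):
    """Return a list of violated structural constraints for a candidate design.

    s, h: lists of 0/1 values (domestic facility / hub upgrade per node).
    y:    dict {(k, l): 0/1} of hub link decisions, with k < l.
    D, F: lists of facility setup and hub upgrade costs.
    I:    dict {(k, l): cost} of hub link establishment costs.
    """
    violations = []
    # Constraint (5): only an established spoke can be upgraded to a hub.
    for k, (hk, sk) in enumerate(zip(h, s)):
        if hk > sk:
            violations.append(f"(5) node {k} is a hub but not a spoke")
    # Constraint (4): a hub link needs a hub at both endpoints.
    for (k, l), ykl in y.items():
        if ykl > min(h[k], h[l]):
            violations.append(f"(4) link {k}-{l} lacks a hub endpoint")
    # Constraint (18): total setup cost must respect the budget.
    spend = (sum(F[k] * h[k] for k in range(len(h)))
             + sum(D[k] * s[k] for k in range(len(s)))
             + sum(I[kl] * ykl for kl, ykl in y.items()))
    if spend > budget:
        violations.append(f"(18) spend {spend} exceeds budget {budget}")
    return violations


# Hypothetical data: three spokes, hubs at nodes 0 and 1, one hub link.
D, F, I = [10, 10, 10], [50, 50, 50], {(0, 1): 20}
feasible = check_design([1, 1, 1], [1, 1, 0], {(0, 1): 1}, D, F, I, budget=200)
broken = check_design([1, 1, 1], [1, 0, 0], {(0, 1): 1}, D, F, I, budget=200)
```

With these numbers, `feasible` is an empty list, while `broken` reports a violation of (4) because node 1 is no longer a hub.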

3.3. Model with Balanced Congestion

Among the earliest works on hub congestion, Elhedhli and Hu [43] define the congestion for a single hub location problem as follows:

\sum_{i} \sum_{j > i} \sum_{m} w_{i j} x_{i j k m} = g_{k} \quad \forall k \in K

(21)

where $g_{k}$ is the congestion of the hub facility k. Later, de Camargo et al. [44] extended the definition for the congestion of a multi-allocation hub location, as the sum of incoming and outgoing flow from that hub:

\sum_{i} \sum_{j} \sum_{l} w_{i j k l} + \sum_{i} \sum_{j} \sum_{l \neq k} w_{i j l k} = g_{k} \forall k \in K

(22)

de Camargo et al. [44] define the congestion function as $\tau_{k} (g_{k}) = e \cdot g_{k}^{b}$, where e and b are positive constants. To have an increasing function, $b \geq 1$ is assumed, as considered by Elhedhli and Hu [43].

In the mentioned articles, as well as other published hub location research, the aim is to minimize the congestion. This can be argued to be a counterproductive objective: while a highly congested facility is not desirable, the flow through a hub facility is needed to make a profit.

We also believe that it is not an easy task to put a monetary value (penalty) on the congestion of a facility and then find the correct value of this cost factor. In this regard, balancing congestion across the network is an alternative to minimizing it. This avoids having highly congested hubs on one side and lightly congested (underutilized) hubs on the other. Thus, we calculate the deviation between the most and least congested facilities and model it as a constraint to distribute the flow in a fairly balanced way through the network. We believe such a model is a more realistic representation of a decision-maker’s concern, as congestion is not necessarily seen as a cost or even a negative factor that needs to be directly minimized. Instead, we believe it needs to be well managed and distributed among the facilities in the network as evenly as possible.

To mathematically formulate this concept, we first need to calculate the congestion of a hub (as reflected in (23)). We then find the hubs with maximum and minimum flow/congestion (24) and bound the maximum by a multiple of the minimum (25). The following terminology is used for the congestion of a hub facility:

$g_{k}$ : the congestion of a hub facility k.
$g^{\max}$ : maximum congestion among all the hub facilities across the network.
$g^{\min}$ : minimum congestion among all the hub facilities across the network.
∇: maximum allowed dispersion among hub congestions across the network.

The final mathematical model (denoted as HSLTP-02) is the extension of model HSLTP-01 by additional constraints (23)–(26).

\sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j k l} + \sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j l k} + \sum_{i \neq k} \sum_{j \neq i} w_{i j} a_{i j k} + \sum_{i \neq k} \sum_{j \neq i} w_{i j} b_{i j k} + \sum_{i \neq k} 2 w_{i k} e_{i k} + \sum_{i \neq k} 2 w_{k i} e_{k i} + \sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j k l} \cdot h_{i} \cdot h_{j} + \sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j l k} \cdot h_{i} \cdot h_{j} = 2 \cdot g_{k} \quad \forall k \in K

(23)

g^{\max} = \max_{k \in K} g_{k}, \quad g^{\min} = \min_{k \in K} g_{k}

(24)

g^{\max} \leq \nabla \cdot g^{\min}

(25)

g_{k} \in R^{+}

(26)
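The balance condition in (24) and (25) is simple to evaluate once the hub congestion values are known. The following sketch assumes the values $g_{k}$ have already been computed (for example, from (23)) and are passed in as a dictionary; the function name, the hub labels, and the numbers are hypothetical.

```python
def congestion_balanced(g, nabla):
    """Check constraint (25) for hub congestion values g = {hub: g_k}.

    Returns (is_balanced, g_max, g_min), where g_max and g_min follow (24).
    """
    g_max = max(g.values())
    g_min = min(g.values())
    return g_max <= nabla * g_min, g_max, g_min


# Hypothetical congestion values for three hubs.
g = {"A": 900.0, "B": 400.0, "C": 650.0}
tight = congestion_balanced(g, nabla=2.0)   # 900 > 2.0 * 400, so not balanced
loose = congestion_balanced(g, nabla=2.5)   # 900 <= 2.5 * 400, so balanced
```

A smaller ∇ forces the hub loads closer together; in this example, the same flows are rejected at ∇ = 2.0 and accepted at ∇ = 2.5.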

3.4. Model with Flow-Based Economies of Scale Inter-Hub Transportation Cost

In the literature, there are not many research articles that model the inter-hub transportation cost function as flow-based. Among the initial works, O’Kelly and Bryan [48] address this issue and model the cost function based on the traversed flow. This model is later extended by Ricardo Saraiva de Camargo et al. [49], who also develop a tailored Benders decomposition algorithm to solve the developed model.

This paper also considers a flow-based cost function for hub links, as a more precise representation of the real-world case than a constant cost function. For this purpose, we consider a piecewise linear function, in which a higher flow volume is associated with a higher discount on transportation costs. This is due to the modular capacity of different transporters. Suppose there is a type of aircraft with capacity “C1” and another type with capacity “C2” > “C1”. As long as the number of passengers does not reach “C1”, the type 1 aircraft is the best option, but once the number of passengers exceeds this threshold (denoted “b” in (28)), the type 1 aircraft can be substituted with type 2, which has a higher capacity. It is not unrealistic to assume that the per-person cost of aircraft type 1 is higher than the per-person cost of aircraft type 2. Therefore, we consider a piecewise linear function to represent the dependency of the cost function on the flow of each link. This is formulated in (28). The new model, HSLTP-03, is as follows:

\min \sum_{i} \sum_{j \neq i} \sum_{k} \sum_{l \neq k} \alpha_{k l} w_{i j} C_{k l} x_{i j k l} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{i k} a_{i j k} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{k j} b_{i j k} + \sum_{i} \sum_{j \neq i} w_{i j} C_{i j} e_{i j} + \sum_{k} F_{k} h_{k} + \sum_{k} D_{k} s_{k} + \sum_{k} \sum_{l > k} I_{k l} y_{k l} + B \cdot (\sum_{m, n} P_{m n} - \sum_{i, j} w_{i j})

(27)

Subject to (4)–(20)

\alpha_{k l} = \begin{cases} \alpha_{k l}^{1}, & \sum_{i} \sum_{j \neq i} w_{i j} x_{i j k l} < b \\ \alpha_{k l}^{2}, & b < \sum_{i} \sum_{j \neq i} w_{i j} x_{i j k l} \end{cases}

(28)

In the objective function, the previously used fixed value of $\alpha$ is changed to $\alpha_{k l}$ to represent the flow-based cost function coefficient.
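The two-segment coefficient in (28) can be sketched as a small function. The names `alpha_1`, `alpha_2`, and `inter_hub_cost`, as well as the numbers, are illustrative assumptions. Since (28) leaves the case where the flow equals b unspecified, this sketch assigns it to the first segment, which is also an assumption.

```python
def inter_hub_discount(flow, b, alpha_1, alpha_2):
    """Flow-based coefficient of (28) for one hub link.

    alpha_1 applies below the breakpoint b and alpha_2 above it.
    Assumption: a flow exactly equal to b uses alpha_1.
    """
    return alpha_1 if flow <= b else alpha_2


def inter_hub_cost(flow, unit_cost, b, alpha_1, alpha_2):
    """Discounted transportation cost of sending `flow` over one hub link."""
    return inter_hub_discount(flow, b, alpha_1, alpha_2) * flow * unit_cost


# Hypothetical link: breakpoint at 100 passengers, unit cost 2.0.
low = inter_hub_cost(80, 2.0, b=100, alpha_1=0.9, alpha_2=0.6)
high = inter_hub_cost(150, 2.0, b=100, alpha_1=0.9, alpha_2=0.6)
```

With these made-up values, the lightly used link receives only the weaker coefficient 0.9, which is the behavior the flow-based formulation is meant to capture: large discounts are reserved for links that actually carry consolidated flow.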

4. Solution Approach

The proposed formulation includes nonlinear terms. Hub-and-spoke location transportation network design is an NP-hard problem, and so is the model proposed in this paper. We propose some techniques to linearize the formulation, as well as tightening rules to obtain a tighter formulation. These techniques are presented in full in Appendix A and Appendix B.

Once the linearization techniques (Appendix A) are applied and the tightening rules are implemented, the model could be solved for very small instances only (up to nine nodes) using a commercial software package (GAMS); the solver could not accommodate larger problems. Thus, a Benders decomposition (BD)-based method is developed to tackle larger instances. In addition, the methodology is further enhanced using ε-optimality and warm-start techniques.

A common approach to tackle large-scale optimization problems is to decompose the problem into smaller problems that are typically easier to solve, so that iteratively solving these smaller problems leads to the solution of the original, large-scale problem. Benders decomposition (BD) is a common approach in the MIP literature and has been used extensively for large-scale MIPs. In general, the integer variables form the master problem, and the remaining LP model is the sub-problem. By solving the LP sub-problem to optimality, more information about the problem is obtained and added to the master problem, which is then re-solved using the new information from the sub-problem. This procedure repeats until no further improvement in the solution is observed.
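The loop described above can be illustrated on a much simpler problem than the HSLTP. The sketch below applies Benders decomposition to a small uncapacitated facility location problem (a swapped-in toy problem, not the model of this paper): the master problem chooses which facilities to open and is solved here by brute-force enumeration, while the sub-problem assigns each customer to its cheapest open facility and returns one cut per customer. The function name, the data, and the closed-form cut are illustrative assumptions, and nonnegative costs are assumed.

```python
from itertools import product


def benders_facility_location(fixed_cost, assign_cost, tol=1e-9, max_iter=100):
    """Toy Benders loop; assign_cost[i][j] = cost of serving customer j from i."""
    n_fac, n_cus = len(fixed_cost), len(assign_cost[0])
    candidates = [y for y in product((0, 1), repeat=n_fac) if any(y)]
    # Each cut (j, v, slopes) states: theta_j >= v - sum_i slopes[i] * y_i.
    cuts = []
    best_y, upper = None, float("inf")

    def setup(y):
        return sum(f * yi for f, yi in zip(fixed_cost, y))

    def theta(y):
        # Lower bound on the assignment cost implied by the cuts so far.
        bound = [0.0] * n_cus  # valid because costs are assumed nonnegative
        for j, v, slopes in cuts:
            rhs = v - sum(sl * yi for sl, yi in zip(slopes, y))
            bound[j] = max(bound[j], rhs)
        return sum(bound)

    for _ in range(max_iter):
        # Master problem: integer decisions only (enumerated for simplicity).
        y = min(candidates, key=lambda c: setup(c) + theta(c))
        lower = setup(y) + theta(y)
        # Sub-problem: with y fixed, each customer uses its cheapest open facility.
        values = [min(assign_cost[i][j] for i in range(n_fac) if y[i])
                  for j in range(n_cus)]
        total = setup(y) + sum(values)
        if total < upper:
            best_y, upper = y, total
        if upper - lower <= tol:
            break  # lower and upper bounds have met
        # Feed the sub-problem information back to the master problem as cuts.
        for j, v in enumerate(values):
            slopes = [max(0.0, v - assign_cost[i][j]) for i in range(n_fac)]
            cuts.append((j, v, slopes))
    return best_y, upper


# Hypothetical instance: 3 candidate facilities, 3 customers.
fixed_cost = [4, 3, 10]
assign_cost = [[1, 5, 6], [6, 1, 2], [2, 2, 2]]
best_y, best_cost = benders_facility_location(fixed_cost, assign_cost)
```

In a realistic implementation, the master problem would be handed to a MIP solver and the cuts would come from the dual of the LP sub-problem; the enumeration and closed-form cut used here only keep the example self-contained.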

As reported by Magnanti and Wong [26] and Naoum-Sawaya and Elhedhli [50], the convergence of the BD is not easily achieved, especially as the size of the problem instance increases. Since the initial development of the BD by Benders in 1963, it has been successfully applied in several research articles, and several improvements have been developed to overcome the shortcomings of the original method. Interested readers can refer to Rahmaniani et al. [51] for further information.

In our case, we implement two accelerating techniques as described in the following sections to boost the algorithm and obtain the solution in a reasonable amount of time for medium-sized instances.

4.1. Master Problem

The master problem determines the domestic facility locations and chooses, among these locations, the hub nodes to be upgraded. The mathematical formulation is as follows:

\text{Minimize} \quad \sum_{k} F_{k} h_{k} + \sum_{k} D_{k} s_{k} + \sum_{k} \sum_{l > k} I_{k l} y_{k l} + B (\sum_{m, n} P_{m n} - \sum_{i, j} w_{i j}) + ϑ

(29)

Subject to

y_{k l} \leq h_{k}, \quad y_{k l} \leq h_{l}, \quad \forall k, l \neq k

(30)

h_{k} \leq s_{k} \forall k

(31)

\sum_{k} F_{k} h_{k} + \sum_{k} D_{k} s_{k} + \sum_{k} \sum_{l > k} I_{k l} y_{k l} \leq \text{Budget}

(32)

w_{i j} = \sum_{m, n} P_{m n} \frac{U_{m n i j} \cdot s_{i} s_{j}}{\sum_{o, d} U_{m n o d} s_{o} s_{d}}, \quad \forall i, j \neq i

(33)

\sum_{k} h_{k} \geq 1

(34)

f_{k l}^{i j} \leq y_{k l}, \quad \forall i, j, k, l \neq k

(35)

f_{k l}^{i j} \leq h_{j}, \quad \forall i, j, k, l \neq k

(36)

f_{k l}^{i j} \leq h_{i}, \quad \forall i, j, k, l \neq k

(37)

\sum_{l \neq i} f_{i l}^{i j} - \sum_{l \neq i} f_{l i}^{i j} = h_{i} \cdot h_{j}, \quad \forall i, j \neq i

(38)

\sum_{k \neq j} f_{k j}^{i j} - \sum_{k \neq j} f_{j k}^{i j} = h_{i} \cdot h_{j}, \quad \forall i, j \neq i

(39)

\sum_{k \neq l} f_{k l}^{i j} = \sum_{k \neq l} f_{l k}^{i j}, \quad \forall i, j \neq i, l \neq i, j

(40)

y_{k l}, h_{k}, s_{k} \in \{0, 1\}

(41)

f_{k l}^{i j} \in R^{+} \forall i, j, k, l

(42)

In the above formulation,

f_{k l}^{i j}

represents the flow from node “i” to node “j” which enters the network at node “k” and leaves the network at node “l” (since “k” and “l” are the established domestic facilities and there is no facility in either “i” or “j”, the passengers travel to “k” as their start point and leave at “l”). Initial investigation reveals that using only the original master problem as the Benders master problem may result in a non-connected graph. To overcome this issue, constraints (34)–(42) are added to form the Benders master problem, as above [52].

4.2. Sub-Problem

Once the master problem is solved, we solve the sub-problem, generate valid inequalities (the integer cuts), add them to the master problem, and solve the master problem again, until a stopping criterion is met (either convergence or a maximum number of cycles). In the master problem, the locations of the hub-and-spoke facilities are determined, as well as the hub links in the hub network graph. Once the locations of the spoke facilities are known, we can calculate the demand of each node using the utility relations explained earlier. Therefore, for the sub-problem, the values of

w_{i j}

are fixed and known and are parameters, rather than variables, as in the integrated model.
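As an illustration of how the flows become fixed parameters once the spoke locations are known, the following Python sketch evaluates the gravity-based relation (33) for a given 0/1 spoke vector. The function name `od_flows`, the nested-list data layout, and the treatment of a zero denominator (the corresponding demand is simply not assigned, i.e., lost) are assumptions made for this sketch rather than details taken from the implementation used in the experiments.

```python
def od_flows(P, U, s):
    """Evaluate w[i][j] = sum_{m,n} P[m][n] * U[m][n][i][j]*s[i]*s[j]
                          / sum_{o,d} U[m][n][o][d]*s[o]*s[d]   for j != i.

    P: size x size demand matrix; U: 4-index perceived utilities;
    s: list of 0/1 spoke indicators fixed by the master problem.
    """
    size = len(s)
    w = [[0.0] * size for _ in range(size)]
    for m in range(size):
        for n in range(size):
            # Total utility of all open facility pairs (o, d) for demand (m, n).
            total = sum(U[m][n][o][d] * s[o] * s[d]
                        for o in range(size) for d in range(size))
            if total == 0:
                continue  # assumption: no attractive open pair, demand is lost
            for i in range(size):
                for j in range(size):
                    if i != j:
                        w[i][j] += P[m][n] * U[m][n][i][j] * s[i] * s[j] / total
    return w
```

Because each origin-destination demand is split in proportion to utility, the entries of `w` sum to the total demand that has at least one attractive open facility pair.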

The sub-problem essentially deals with the transportation problem, which also decides upon the flow on hub links. We define the cost function of hub links as a function of the traversed flow. In simple terms, we need to know the flow to set the correct value of α (which is needed to solve the sub-problem), but we first need to solve the sub-problem to determine the traversed flow. To overcome this interdependency, we develop a heuristic (Algorithm 1) that can be described as follows:

Algorithm 1 Sub-Problem

Step 1: Set the initial value α_{k l} = α_{k l}^{2}.
Step 2: Run the sub-problem.
Step 3: For every pair (k, l), if \sum_{i} \sum_{j \neq i} w_{i j} x_{i j k l} < b, then set α_{k l} \leftarrow α_{k l}^{1}.
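Step 3 of Algorithm 1 can be sketched in Python as follows. The sketch assumes that the routing variables returned by the sub-problem are available as a dictionary `x` keyed by `(i, j, k, l)` and that the two discount levels are stored per link in the dictionaries `alpha1` and `alpha2`; these names and data structures are hypothetical and chosen only for illustration.

```python
def link_flows(w, x):
    """Flow on each hub link (k, l): sum over i and j != i of w[i][j] * x[(i, j, k, l)]."""
    flows = {}
    for (i, j, k, l), share in x.items():
        if i != j:
            flows[(k, l)] = flows.get((k, l), 0.0) + w[i][j] * share
    return flows

def update_alpha(flows, alpha1, alpha2, b):
    """Keep alpha2 (the initial value) unless the link flow is below the threshold b,
    in which case the coefficient is switched to alpha1."""
    return {kl: (alpha1[kl] if flows.get(kl, 0.0) < b else alpha2[kl])
            for kl in alpha2}
```

After the update, the sub-problem would be solved again with the revised coefficients; how many such passes are performed is not specified here and is left to the caller.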

We can then write the sub-problem, assuming fixed values for

α_{k l}

, as follows:

\min ϑ = \sum_{i} \sum_{j \neq i} \sum_{k} \sum_{l \neq k} α w_{i j} C_{k l} x_{i j k l} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{i k} a_{i j k} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{k j} b_{i j k} + \sum_{i} \sum_{j \neq i} w_{i j} C_{i j} e_{i j}

(43)

Subject to

\sum_{l \neq i} x_{i j i l} + \sum_{l \neq i, j} a_{i j l} + e_{i j} = z, \quad \forall i, j \neq i

(44)

\sum_{l \neq i} x_{i j l j} + \sum_{l \neq i, j} b_{i j l} + e_{i j} = z, \quad \forall i, j \neq i

(45)

\sum_{l \neq k, i} x_{i j k l} + b_{i j k} = \sum_{l \neq k, j} x_{i j l k} + a_{i j k}, i, j \neq i, k \neq i, j

(46)

x_{i j k l} \leq y_{k l}, i, j \neq i, k, l

(47)

\sum_{l \neq k} x_{k j k l} \leq h_{k}, j, k \neq j

(48)

\sum_{k \neq l} x_{i l k l} \leq h_{l}, i, l \neq i

(49)

a_{i j k} \leq 1 - h_{i}, \quad \forall i, j \neq i, k \neq i, j

(50)

b_{i j k} \leq 1 - h_{j}, \quad \forall i, j \neq i, k \neq i, j

(51)

a_{i j k} + \sum_{l \neq j, k} x_{i j k l} \leq h_{k}, i, j \neq i, k \neq i, j

(52)

b_{i j k} + \sum_{l \neq k, i} x_{i j k l} \leq h_{k}, i, j \neq i, k \neq i, j

(53)

e_{i j} + 2 x_{i j i j} + \sum_{l \neq j, i} x_{i j i l} + \sum_{l \neq i, j} x_{i j l j} \leq h_{i} + h_{j}, \quad \forall i, j \neq i

(54)

\sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j k l} + \sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j l k} + \sum_{i \neq k} \sum_{j \neq i} w_{i j} a_{i j k} + \sum_{i \neq k} \sum_{j \neq i} w_{i j} b_{i j k} + \sum_{i \neq k} 2 w_{i k} e_{i k} + \sum_{i \neq k} 2 w_{k i} e_{k i} + \sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j k l} \cdot h_{i} \cdot h_{j} + \sum_{i} \sum_{j \neq i} \sum_{l \neq k} w_{i j} x_{i j l k} h_{i} h_{j} = 2 g_{k} \forall k \in K

(55)

g^{Max} = \max_{k \in K} (g_{k}), \quad g^{Min} = \min_{k \in K} (g_{k})

(56)

g^{Max} \leq \nabla \cdot g^{Min}

(57)

x_{i j k l}, a_{i j k}, b_{i j k}, e_{i j} \in (0, 1)

g^{Max}, g_{k} \in R^{+}

4.3. Dual of the Sub-Problem

The dual of the above-formulated sub-problem is as follows:

\max ϑ = - (\sum_{i, j \neq i} (u_{i j} + v_{i j}) + \sum_{i, j \neq i} \sum_{k \neq i, j} (s_{i j k} + w_{i j k}) h_{k} + \sum_{j, k \neq j} p_{j k} h_{k} + \sum_{i, l \neq i} q_{i l} h_{l} + \sum_{i, j \neq i} d_{i j} (h_{i} + h_{j}) + \sum_{i, j \neq i} \sum_{k \neq i, j} (a_{i j k} (1 - h_{i}) + b_{i j k} (1 - h_{j})) - \sum_{i, j \neq i} \sum_{k, l \neq k} o_{i j k l} y_{k l})

(58)

Subject to

2 w_{i j} m_{i} h_{i} + 2 w_{i j} m_{j} h_{j} + u_{i j} + v_{i j} + p_{j i} + q_{i j} + o_{i j i j} + 2 d_{i j} \geq - α w_{i j} C_{i j}, \quad \forall i, j \neq i

(59)

w_{i j} m_{j} h_{j} + w_{i j} m_{k} h_{k} + v_{i j} + r_{i j k} + w_{i j k} + q_{i j} + o_{i j k j} + d_{i j} \geq - α w_{i j} C_{k j}, \quad \forall i, j \neq i, k \neq i, j

(60)

w_{i j} m_{i} h_{i} + w_{i j} m_{l} h_{l} + u_{i j} + p_{j i} + d_{i j} + s_{i j l} - r_{i j l} + o_{i j i l} \geq - α w_{i j} C_{i l}, \quad \forall i, j \neq i, l \neq i, j

(61)

w_{i j} m_{k} h_{k} + w_{i j} m_{l} h_{l} + r_{i j k} - r_{i j l} + s_{i j l} + w_{i j k} + o_{i j k l} \geq - α w_{i j} C_{k l}, \quad \forall i, j \neq i, k \neq i, j

(62)

w_{i j} m_{k} h_{k} + u_{i j} - r_{i j k} + s_{i j k} + a_{i j k} \geq - w_{i j} C_{i k}, \quad \forall i, j \neq i, k \neq i, j

(63)

w_{i j} m_{k} h_{k} + v_{i j} + r_{i j k} + w_{i j k} + b_{i j k} \geq - w_{i j} C_{k j}, \quad \forall i, j \neq i, k \neq i, j

(64)

2 w_{i j} m_{i} h_{i} + 2 w_{i j} m_{j} h_{j} + u_{i j} + v_{i j} + d_{i j} \geq - w_{i j} C_{i j}, \quad \forall i, j \neq i

(65)

- 2 m_{k} - \nabla \cdot {ma}_{k} + {mi}_{k} \geq 0, \quad \forall k

(66)

\sum_{k : h_{k} = 1} {ma}_{k} - \sum_{k : h_{k} = 1} {mi}_{k} \geq 0

(67)

d_{i j}, e_{i j}, p_{i j}, q_{i j}, a_{i j k}, b_{i j k}, s_{i j k}, w_{i j k}, o_{i j k l} \in R^{+}

(68)

u_{i j}, v_{i j}, r_{i j k}, m_{k}, {ma}_{k}, {mi}_{k} \ \text{URS}

(69)

By solving this dual problem, we can generate the cut for the master problem, which is as follows:

- (\sum_{i, j \neq i} (u_{i j} + v_{i j}) + \sum_{i, j \neq i} \sum_{k \neq i, j} (s_{i j k} + w_{i j k}) h_{k} + \sum_{j, k \neq j} p_{j k} h_{k} + \sum_{i, l \neq i} q_{i l} h_{l} + \sum_{i, j \neq i} d_{i j} (h_{i} + h_{j}) + \sum_{i, j \neq i} \sum_{k \neq i, j} (a_{i j k} (1 - h_{i}) + b_{i j k} (1 - h_{j})) - \sum_{i, j \neq i} \sum_{k, l \neq k} o_{i j k l} y_{k l}) \leq ϑ

(70)

4.4. Using E-Optimality Technique

As noted by Magnanti and Wong [25] and Zarandi [53], in BD the MP and SP need not be solved to optimality; finding a relatively good solution is sufficient, and the convergence of the algorithm is still proven in this case. This is particularly relevant in the early iterations of the algorithm, when little information about the solution is available and solving the MP and SP to optimality is time-consuming yet returns little information. Based on this, we define a tunable parameter for the optimality gap of MP and SP. We also periodically solve MP and SP to optimality, so as not to drift far from the final solution. Our numerical results confirm the efficiency of this technique in speeding up the BD algorithm. The pseudo-code is given in Algorithm 2:

Algorithm 2 E-Optimality Technique

Determine n1 and n2, the periods (in iterations) at which MP and SP, respectively, are solved to optimality; determine the initial solution gaps for MP and SP as G1 and G2.
for (i = 1:n) do
    if i \bmod n_{1} = 0 then
        {G1}_{i} \leftarrow 0;
        Run MP;
    else
        {G1}_{i} \leftarrow \frac{G1}{i};
        Run MP;
    end if
    if i \bmod n_{2} = 0 then
        {G2}_{i} \leftarrow 0;
        Run SP;
    else
        {G2}_{i} \leftarrow \frac{G2}{i};
        Run SP;
    end if
end for
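The gap schedule of Algorithm 2 can be sketched in Python as follows. The sketch only computes the optimality gaps to be passed to the solver at each iteration; the calls that actually run MP and SP are omitted. The function names are hypothetical, and the periodicity test is read here as "iteration i is a multiple of the period", which is an interpretation of the pseudo-code rather than a documented detail.

```python
def gap_for_iteration(i, period, initial_gap):
    """Return 0 (solve to optimality) every `period` iterations,
    otherwise a gap that shrinks as initial_gap / i."""
    if i % period == 0:
        return 0.0
    return initial_gap / i

def gap_schedule(n_iter, n1, n2, G1, G2):
    """List of (MP gap, SP gap) pairs for iterations 1..n_iter."""
    return [(gap_for_iteration(i, n1, G1), gap_for_iteration(i, n2, G2))
            for i in range(1, n_iter + 1)]
```

For example, with `n1 = 2` and `n2 = 3`, the master problem is solved to optimality at iterations 2, 4, 6, and so on, and the sub-problem at iterations 3, 6, 9, and so on, while all other iterations use progressively tighter gaps.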

4.5. Using Warm-Start Technique

It is also discussed in the literature that starting from a good-quality solution helps BD converge faster and perform better. We implement a simple heuristic algorithm to generate a set of initial solutions, based on the characteristics of the problem, with the aim of helping the algorithm converge and solve larger instances of the problem.

For this purpose, the following heuristic Algorithm 3 is developed and applied:

Algorithm 3 Warm-Start Technique

Procedure to find initial values of spoke locations
for (i = 1:NS) do
    if d(s) \geq Q1(d) then
        s \leftarrow 1;
    else if d(s) \leq Q3(d) then
        s \leftarrow 0;
    else
        P(s = 0) = P(s = 1) = 0.5;
    end if
end for

This can be summarized as follows:

s = \begin{cases} 1, & \text{if } \text{demand}(i) > \frac{2}{3} \, \text{demand} \text{ and } \text{distance}(i) < \frac{1}{3} \, \text{average distance} \\ 0, & \text{if } \text{demand}(i) < \frac{1}{3} \, \text{demand} \text{ and } \text{distance}(i) > \frac{2}{3} \, \text{average distance} \\ P(s = 1) = P(s = 0) = 0.5, & \text{otherwise} \end{cases}

We also consider the budget constraint in this phase, so as not to locate facilities beyond the budget limitations.
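The threshold rule summarized above can be sketched in Python as follows. The text does not specify which reference value "demand" stands for on the right-hand side (for example, the maximum or the average demand), so the sketch takes it as an explicit argument `ref_demand`; this argument, `ref_distance`, and the function name are assumptions made for illustration. The budget check is not included.

```python
import random

def initial_spokes(demand, distance, ref_demand, ref_distance, rng=None):
    """Warm-start spoke indicators from demand- and distance-based thresholds.

    demand[i], distance[i]: demand and (average) distance of node i.
    ref_demand, ref_distance: reference values for the 1/3 and 2/3 thresholds
    (assumed inputs; their exact definition is not given in the text).
    """
    rng = rng or random.Random()
    spokes = []
    for dem, dist in zip(demand, distance):
        if dem > 2 / 3 * ref_demand and dist < 1 / 3 * ref_distance:
            spokes.append(1)                  # high demand and central: open
        elif dem < 1 / 3 * ref_demand and dist > 2 / 3 * ref_distance:
            spokes.append(0)                  # low demand and remote: closed
        else:
            spokes.append(rng.randint(0, 1))  # P(s = 0) = P(s = 1) = 0.5
    return spokes
```

Passing a seeded `random.Random` instance makes the generated initial solutions reproducible across runs.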

5. Heuristic Algorithm

The problem we formulated in this research is NP-hard, even after applying all linearization and tightening techniques. In our initial experiments, we observed that the exact algorithm is not able to solve large instances of the problem, which motivates us to develop a heuristic algorithm to address this issue.

The BD (with the improvement techniques) is able to find the exact solution only for instances of up to 40 nodes. To solve larger instances of the problem in a reasonable time, we develop a heuristic algorithm that uses some structural properties of the problem.

5.1. Fixing Hub-and-Spoke Locations

In this step, the heuristic algorithm chooses the nodes with and without established facilities. This is achieved using the following simple, intuitive, yet (as our experiments show) efficient rules, which are distance-based and demand-based. We consider the nodes with relatively high demand to have spoke facilities and those with relatively low demand to have no facilities. The nodes in between may or may not have a facility, with a given probability (say, 50%). The same can be argued for distance: nodes that are relatively close to the other nodes have an established facility, while nodes that are relatively far from other nodes do not.

In combination, it can be stated that a spoke facility is established on either nodes with relatively high demand or on nodes that are close to several other nodes (to be favorable to transfer flow). These are shown as

s_{i} = \begin{cases} 1, & \text{if } \text{demand}(i) > \frac{2}{3} \, \text{demand} \text{ and } \text{distance}(i) < \frac{1}{3} \, \text{average distance} \\ 0, & \text{if } \text{demand}(i) < \frac{1}{3} \, \text{demand} \text{ and } \text{distance}(i) > \frac{2}{3} \, \text{average distance} \\ P(s = 1) = P(s = 0) = 0.5, & \text{otherwise} \end{cases} \quad \forall i

We then need to check the feasibility of the budget constraint and make sure the selected spoke facilities satisfy it. If the constraint is not satisfied, another set is chosen (since some spoke facilities are chosen randomly). Once the spoke locations are determined, we update the budget constraint and choose random locations for hub facilities that satisfy the updated budget constraint.

5.2. Establishing Hub Links

Once the spoke-and-hub facility nodes are determined, the model establishes the hub links by randomly selecting hub pairs and placing a link between them. The budget constraint is also updated after hub-and-spoke location determination, and the randomly selected hub links should satisfy the updated budget constraint. The resulting graph is also checked to be connected, although a fully connected (complete) graph is not necessary.
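A minimal Python sketch of this step is given below: hub links are sampled at random and a sample is accepted only if it respects the remaining budget and yields a connected hub graph. The inclusion probability of 0.5 per candidate link, the retry limit `max_tries`, and the function names are assumptions introduced for the sketch; the text does not specify how the random selection is carried out.

```python
import random
from itertools import combinations

def is_connected(hubs, links):
    """Depth-first search over undirected hub links."""
    hubs = list(hubs)
    if not hubs:
        return True
    adjacency = {h: set() for h in hubs}
    for k, l in links:
        adjacency[k].add(l)
        adjacency[l].add(k)
    seen, stack = {hubs[0]}, [hubs[0]]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(hubs)

def random_hub_links(hubs, link_cost, budget, rng=None, max_tries=1000):
    """Sample link sets until one is connected and within the remaining budget.
    Returns None if no acceptable set is found within max_tries samples."""
    rng = rng or random.Random()
    candidates = list(combinations(sorted(hubs), 2))
    for _ in range(max_tries):
        links = [kl for kl in candidates if rng.random() < 0.5]
        cost = sum(link_cost[kl] for kl in links)
        if cost <= budget and is_connected(hubs, links):
            return links
    return None
```

Note that connectivity requires at least one path between every pair of hubs, not a direct link between every pair, which is why sparse link sets can be accepted.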

5.3. Solving the Flow Routing Problem

Once the hub-and-spoke network is constructed, the problem reduces to a flow routing problem. We now describe the developed heuristic algorithm. This heuristic algorithm is developed with insights from the mechanics of BD in the literature, e.g., as noted by Zverovich et al. [54] and Linderoth and Wright [55]: the neighborhood of a high-quality solution is more likely to be of similar quality. This follows the procedure in Step 3. In the following algorithm (Table 1), "best solution(s)" refers to the best solutions with respect to the objective function.

In the heuristic algorithm, we apply solution diversification in Steps 1, 3, 5, and 8 by implementing 1-opt/2-opt algorithms. In Steps 2, 4, and 6, intensification of the solution pool is implemented by choosing only the elite solutions from a pool of several solutions.

In Step 3, ξ is a tuning parameter for the model. We arbitrarily choose ξ = 80%, but it can be further modified, with higher values of ξ resulting in tighter selection of variables, while lower values of ξ end in a more generous approach to locate facilities (as less common solutions are needed as selection criteria).
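One possible reading of the role of ξ is that a facility is fixed when it appears in at least a fraction ξ of the elite solutions. The Python sketch below implements this reading only; the representation of a solution as a set of node indices and the function name are assumptions, and the exact rule used in Step 3 may differ.

```python
def consensus_facilities(elite_solutions, xi=0.8):
    """Return the nodes that host a facility in at least a fraction xi
    of the elite solutions (each solution is a set of node indices)."""
    if not elite_solutions:
        return set()
    counts = {}
    for solution in elite_solutions:
        for node in solution:
            counts[node] = counts.get(node, 0) + 1
    threshold = xi * len(elite_solutions)
    return {node for node, count in counts.items() if count >= threshold}
```

Under this reading, raising ξ shrinks the set of fixed facilities, while lowering it fixes facilities that are shared by fewer solutions.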

6. Experiments and Numerical Results

We ran numerical experiments to illustrate the performance, as well as different aspects, of the proposed formulations. Such analysis reveals the effect of different parameters on the final solution, brings managerial insights from the mathematical formulation, and helps to achieve a better understanding of the tradeoffs among the decisions involved.

The experiments were run on a Dell laptop with an Intel Core i5-4210U CPU at 1.70 GHz and 16 GB of RAM. All formulations were coded in C# and solved using the CPLEX 12.6.3 library. In all experiments, the maximum computing time was set to 86,400 s (one day).

We used the Turkish network dataset. As it did not have the cost of the spoke location, we assumed it to be D_{k} = ω F_{k}, where ω = 0.3. We started the experiments using a constant inter-hub transportation cost function and further extended it to include a flow-based cost function in the following experiments. We also needed to use the perceived utility, which is not in the dataset. To generate the utility values, we implemented the following rules by calculating the attractiveness of a set of O-D:

If the distance between the starting point (m) and the origin facility (i) is more than the sphere of influence of (i) {which we arbitrarily consider to be 500}, or the distance between the final destination (n) and the destination facility (j) is more than the sphere of influence of (j), the perceived utility U_{m n i j} is zero (0). Otherwise, the perceived utility is calculated as

U_{m n i j} = \begin{cases} 10{,}000, & \text{if } m = i \text{ and } n = j \\ \text{Rand}[400, 600], & \text{if } (m = i \text{ and } n \neq j) \text{ or } (m \neq i \text{ and } n = j) \\ \text{Rand}[100, 200], & \text{if } m \neq i \text{ and } n \neq j \end{cases}

Considering the triangle inequality, it is obvious that if the node “m” has a facility established, it is not reasonable to transfer the demand to another origin facility, such as “i”. The same is true for “n” and any destination facility “j”.
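The utility-generation rules above can be sketched in Python as follows. The sketch assumes a distance matrix `dist` indexed by node, a single sphere-of-influence radius of 500 for all facilities, and uniform sampling for Rand[·,·]; the function name and the optional random generator argument are hypothetical.

```python
import random

def perceived_utility(m, n, i, j, dist, radius=500.0, rng=None):
    """Perceived utility U_{mnij} of using origin facility i and destination
    facility j for demand from m to n, following the rules stated above."""
    rng = rng or random.Random()
    # Outside the sphere of influence of either facility: zero utility.
    if dist[m][i] > radius or dist[n][j] > radius:
        return 0.0
    if m == i and n == j:
        return 10000.0
    if m == i or n == j:              # exactly one endpoint coincides
        return rng.uniform(400, 600)
    return rng.uniform(100, 200)      # neither endpoint coincides
```

The large value for `m == i` and `n == j` makes a facility located at the demand node overwhelmingly preferred, in line with the triangle-inequality argument.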

We also consider three levels of alpha for the HSLPT base model, i.e., α = 0.2, 0.5, 0.8. The cost of losing a demand (also mentioned as lost demand cost) is designed in four levels of 0.05, 0.2, 0.5, and 0.8 to cover a wide range of possible configurations. The MIP gap is set to be <5%, unless otherwise stated. We were able to solve networks of up to 9 nodes within this optimality gap, with the exception of two instances: one for the congestion model and one for the flow-based inter-hub cost function. For networks with n ≥ 10 nodes, the exact solver could not close the 5% gap within the given time limit. For example, for n = 10 nodes, the LP gap was 29%, and for n = 12 nodes, the LP gap was 98.8%. Therefore, we decided not to report those experiments.

6.1. Base Model Experiments

We start our experiments by using the base model, which includes neither congestion nor the variable cost function on hub links. Running this model gives us a fundamental idea of the performance of the model. It is also important for studying the effect of treating the locations of spoke facilities not as given parameters but as decision variables, as we do in this article. For this purpose, we also consider cases in which the spoke locations are given a priori, as is common in hub/location-related articles. We randomly generate the set of spoke locations, run this 25 times, and compare the average value of the total cost with the “integrated model”, in which the model decides upon both hub-and-spoke locations.

The results are reported in Table 2. We have four levels of the lost demand cost in column 1 and the number of nodes in column 2. We report the number of spokes, hubs, and established hub links in columns “s”, “h”, and “hub links”, respectively. The total cost is reported in the “total cost” column, followed by the run time in the column labeled “run time”. The column “objective improvement” states the relative improvement compared to the case where the spoke locations are given, as is common in hub/location-related articles.

From the experimental results in Table 2, we observe some insights: the better the utilization of hub links (lower value for alpha), the more hubs located in the network. It is also seen that as the lost demand cost increases, the model in general tends to locate more spoke facilities to satisfy a higher portion of the demand (to avoid incurred lost demand costs). At the two ends of the spectrum, with very high and very low values for lost demand, the solution time is relatively shorter than the mid-values of lost demand: for high lost demand costs, the model tends to locate more facilities to collect more demand, while for the lowest lost demand costs, the model is generous in losing demand (due to the fairly low penalty); for the mid-values, there is a trade-off on whether to satisfy or to lose the demand, which impacts the number of spokes and the structure of the network and consequently requires extra computational effort to make such decision.

In general, as the hub utilization preference increases (lower values of α), the total cost decreases (improves). In the case of 6, 7, and 8 nodes, since there is only one hub facility and no hub links (with lost demand cost = 0.8), the objective functions remain the same. For larger networks that have hub links, an intuitive outcome is usually observed.

The run times reported in Table 2 also indicate the extensive computational effort required to solve this model for larger instances. Even though we set our maximum run time to 86,400 s, we were not able to solve larger instances.

6.2. Benders Decomposition

As discussed earlier, the exact solution approach using the CPLEX solver is not able to solve mid-size or larger instances of the problem. To overcome this, we use the Benders decomposition algorithm, enhanced with two improvement techniques, namely warm-start and the e-optimality approach. We run the base model as in the previous case using the Benders-based approach, which allows us to solve larger instances of the problem. The results are reported in Table 3. For n = 10, the Benders-based solution is much faster than the CPLEX solver while obtaining the same results, since the Benders approach is an exact approach. In line with the results for small instances in Table 2, as the value of alpha increases, the total cost increases, which in some cases leads to fewer hub facilities/hub links because the higher hub transportation cost makes them less desirable. Regarding the Modified Benders Algorithm (with preprocessing enhancements), the run time is improved compared to the classic Benders approach, which enables our algorithm to solve instances of 45 nodes in less than two hours. Since the initial exact algorithm does not solve instances larger than 10 nodes, we use the Benders solution to study the effect of congestion.

6.3. Effect of Congestion

As we incrementally develop our model, we argue that the congestion is an important issue, which might be favored by senior managers for strategic-level decision-making. We add the congestion balance constraints to form the (HSLPT-02) model. Here, we investigate how that affects the performance and solution obtained for this model. We study the case with a congestion factor = 2 (the highest-congested hub facility should not be more than double as congested as the least-congested facility). The results are summarized in Table 4.
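The congestion balance condition, constraint (57), can be checked for a given solution with a few lines of Python. The sketch assumes that the hub loads g_k are available in a dictionary keyed by node and that only the established hubs are compared; the function name and data layout are hypothetical.

```python
def congestion_balanced(g, hubs, factor=2.0):
    """Check g_max <= factor * g_min over the established hubs,
    where g[k] is the flow handled by hub k."""
    loads = [g[k] for k in hubs]
    if not loads:
        return True  # no hubs, nothing to balance
    return max(loads) <= factor * min(loads)
```

With a congestion factor of 2, hub loads of 100 and 60 satisfy the condition, whereas adding a hub that handles only 10 units would violate it.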

Comparing the results of Table 3 and Table 4 reveals that the congestion causes an increase in the objective function: from a mathematical point of view, it is an additional constraint whose addition does not expand the feasible solution space. As a result, by adding the congestion constraints, the feasible solution space either remains the same or shrinks, which may exclude favorable points and degrade the objective function (an increase in our case). From the comparison of both tables, we can also learn about the effect of congestion on the solution: considering the congestion in the model may leave the number of hubs (spokes) unchanged, reduce it, or increase it. To further investigate this phenomenon, we analyze the structure of the solution network and the differences with/without congestion, which gives valuable insights into the mechanism of the solution. In some cases, there is a highly congested hub, compared to the least-congested hub in the network (more than the pre-determined congestion ratio). Once the congestion constraint is introduced, the model prefers to establish the network with fewer hub facilities, by simply removing the least-congested hub facility, and either spreads the flow among the other hubs or loses it. In some other cases with similar situations, the model adds a hub facility to spread the flow among different hub facilities, so as not to violate the congestion constraint. This occurs in instances with high lost demand costs, in which the model prefers not to simply lose the demand, and also with low values of α, which further justifies adding another hub facility (in a 35-node network with lost demand = 0.5, α = 0.2 → α = 0.5).

A common trend in all the tables and analyses is that a lower number of spoke facilities results in lost demand, which is not favored by the model for expensive lost demand costs. A higher number of spoke facilities, on the other hand, results in absorbing the budget and leaving a small number of hub locations, which is not favored for the model since inter-hub transportation is well preferred.

6.4. Effect of Flow-Based Cost Function on the Hub Links

In this work, we also consider a flow-based transportation discount factor, in contrast to most of the research in the literature, which assumes a fixed value. In this section, we run numerical experiments with a flow-based cost function to study how this makes our formulation more realistic. The results are reported in Table 5. The results show that the flow-based cost function results in equal or better total costs compared to the case with α = 0.5, which is intuitive. In the same way, we can observe that the results of the variable cost coefficient are no better than those for α = 0.2, which has the lowest total cost. The results also show that considering a flow-based alpha can affect the number of hub facilities/hub links, e.g., for the network with 40 nodes, the number of hub links increases from 70 (α = 0.5) to 72 (for flow-based α from 0.2 to 0.5). This indicates the higher preference of the model to use more hub facilities/hub links to further benefit from the economies of scale discount. The run time is also longer for the flow-based alpha, since the model performs several different tradeoffs within a larger solution space, and a set of additional variables, α_{k l}^{i}, f_{k l}^{L}, Z_{k l}^{L}, is introduced to the model.

6.5. Heuristic Algorithm Results

To find an optimal or near-optimal solution in a reasonable amount of time, we developed a heuristic algorithm. In this section, we analyze the efficiency and performance of the algorithm. We run the heuristic algorithm for small instances, for which we have the optimal solution. Thus, we are able to compare the quality of the results of the heuristic algorithm with those of the exact method.

The results in Table 6 show that the heuristic algorithm is able to find high-quality solutions (and finds the optimal solution for small instances). The run time of the heuristic method is also reported. For instances with six nodes, the exact algorithm reaches optimality in a relatively shorter run time, but as the number of nodes increases, the advantage shifts to the heuristic algorithm; for networks of only nine nodes, the heuristic method already outperforms the exact method by far in terms of run time. The quality of the solutions obtained from the heuristic algorithm is confirmed by the experiments, as shown in Table 6, and since the run time is significantly improved, we run larger instances using the heuristic algorithm to find good enough solutions. To achieve a robust comparison, we ran the heuristic algorithm 25 times on the base model, and we report the average values of run time and cost function in Table 6.

7. Conclusions

In this paper, we present a hub-and-spoke location/transportation problem and several alternative formulations of the problem with additional considerations, such as perceived utility function, hub congestion, and variable hub link cost function. We propose a new approach to include congestion and add congestion balancing constraints as opposed to congestion minimization.

To formulate the problem, we develop a nonlinear hub-and-spoke location model. We apply linearization techniques to linearize the model and propose some tightening rules to further trim the model and produce a tighter formulation to aid the solution procedure. As in many hub-and-spoke problems, the model is not solvable to optimality for larger instances (it is NP-hard), but we were able to solve small instances, either to optimality or with less than a 5% gap. We then develop an enhanced Benders algorithm to solve larger instances of the problem and successfully solve instances of up to 40 nodes. To tackle even larger instances, we develop a heuristic algorithm and evaluate its efficiency. The results demonstrate the following:

Considering the congestion in the model would significantly increase the total cost objective but would comply with post-pandemic regulations on limited congestion in each facility. It would also increase the computational time for the problem but would increase customer satisfaction when traveling through less-congested facilities.
Using the variable value of α would result in a more realistic solution.
The smaller the values of α (hub link transportation cost coefficient), the larger the number of hub facilities would be. This, in addition, would result in a longer solution time but better objective function value.
Modeling and solving the integrated model, in which both spoke-and-hub locations should be determined, results in a better solution, compared to the case where the number and/or location of the spoke facilities are given. This requires more computational effort and time.

High-level decision-making problems, such as location/transportation decisions, are categorized as strategic-level decisions, and the following managerial insights apply to them. From the numerical experiments, we understand that considering a flow-based inter-hub transportation discount not only brings a more robust solution but also yields a solution that is closer to real-world costs. It is recommended to choose the spoke locations together with the hub locations to obtain higher-quality solutions (a global rather than a local optimum).

Further research can be extended in the following directions:

In this work, we try to evenly distribute the congestion among different hub locations. We also determine the passing flow of each node, based on some utility functions. This can be extended in two directions:
○ To model the congestion as a separate objective function and then solve the multi-objective model.
○ To model the effect of congestion as a part of the utility function, which would affect the utility of facilities and may change the favorite hub locations for passengers to pass through.
To get closer to the real-world decision-making procedure, it is recommended to include a time dimension into the formulation. It is obvious that hub location decisions are not planned for short-term utilization but are a strategic decision with long-term effects. During this time period, several initial parameters (such as demand, which is only an estimation at the time of decision-making on hub-and-spoke locations) would change which would impact the “current optimal” or “near-optimal” solution. Considering this, a multi-period, or any other time-included, formulation seems to be a better fit for such strategic mid-level decision-making.

In this work, as in other research projects, we have experienced limitations. The first limitation is not having the exact customer demand before and after locating the facilities. We have incorporated a utility-based model to capture such demand, but it is a multi-factorial phenomenon which may not be fully captured by any single model, including the one we have presented here. In addition, the dynamics of such demand may change in the presence of a competitor company offering similar services. This can be incorporated into the model. Computational capacity is another limitation of this research project.

Author Contributions

Conceptualization, S.K. and A.B.; methodology, S.K. and M.Z.-S.; software, A.B. and S.K.; validation, S.K., A.B. and M.Z.-S.; formal analysis, S.K., A.B. and M.Z.-S.; investigation, S.K. and A.B.; resources, S.K. and M.Z.-S.; data curation, S.K., A.B. and M.Z.-S.; writing—original draft preparation, A.B. and S.K.; writing—review and editing, S.K., A.B. and M.Z.-S.; visualization, S.K., A.B. and M.Z.-S.; supervision, S.K., A.B. and M.Z.-S.; project administration, S.K.; funding acquisition, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Shahrzad Khosravi.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Linearization of Nonlinear Constraints

In the proposed models, there are nonlinear constraints. The nonlinearity of the model is inevitable but adds to the complexity of the solution procedure and makes the solution less reliable. To overcome such issues, we apply linearization techniques to linearize the nonlinear constraints.

a.: Nonlinear constraints of the utility function

The flow transported from node “i” to node “j” is

w_{i j} = \sum_{m, n} P_{m n} \frac{U_{m n i j} \cdot s_{i} s_{j}}{\sum_{o, d} U_{m n o d} s_{o} s_{d}}, i, j \neq i

(A1)

which is a nonlinear expression. To linearize it, we introduce a set of substitutions and additional constraints that yield an equivalent linear formulation. We define the probability of choosing path i-j for demand from m to n as

\mathrm{prob}(m, n, i, j) = \frac{U_{m n i j} \cdot s_{i} s_{j}}{\sum_{o, d} U_{m n o d} s_{o} s_{d}}

(A2)

Thus, the demand from m to n going through i and j is

P_{m n} \, \mathrm{prob}(m, n, i, j) = \mathrm{dem}(m, n, i, j)

(A3)

where

\mathrm{dem}(m, n, i, j)

is the final, calculated demand from “m” to “n” which uses “i” as the origin node and “j” as the destination on the network. Based on this, we have

w_{i j} = \sum_{m, n} \mathrm{dem}(m, n, i, j), i, j \neq i

(A4)
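To make the allocation rule in (A1)–(A4) concrete, the following minimal Python sketch computes the flows w_{ij} from a demand table and a utility table. The function name `gravity_flows`, the dictionary layout, and the toy numbers are assumptions made for illustration only; no lost-demand option is modeled here, so all demand is shared among the open pairs.

```python
from itertools import product


def gravity_flows(P, U, s):
    """Compute w[i, j] as in (A1)-(A4).

    P[(m, n)]       : potential demand from m to n
    U[(m, n, i, j)] : utility of serving (m, n) through the pair (i, j)
    s[i]            : 1 if a spoke is open at node i, else 0
    """
    pairs = [(i, j) for i, j in product(s, repeat=2) if i != j]
    w = {pair: 0.0 for pair in pairs}
    for (m, n), demand in P.items():
        # Denominator of (A2): total utility over all open pairs.
        denom = sum(U[m, n, o, d] * s[o] * s[d] for o, d in pairs)
        if denom == 0:
            continue  # no open pair can serve this demand
        for i, j in pairs:
            prob = U[m, n, i, j] * s[i] * s[j] / denom  # (A2)
            w[i, j] += demand * prob                    # (A3) and (A4)
    return w


# Toy instance (hypothetical numbers): three nodes, node 2 has no spoke.
s = {0: 1, 1: 1, 2: 0}
P = {(0, 1): 100.0}
U = {(0, 1, i, j): 5.0 for i in s for j in s if i != j}
U[0, 1, 0, 1] = 3.0
U[0, 1, 1, 0] = 1.0
w = gravity_flows(P, U, s)
```

In this toy instance only the pairs (0, 1) and (1, 0) are open, so the demand of 100 is split between them in proportion to their utilities (3 to 1), and every pair involving node 2 receives no flow.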

Constraint (A4) remains nonlinear, which adds to the complexity of the model and makes it hard to solve. To address this issue, we linearize the first set of constraints with the technique also used by Khosravi and Akbari Jokar (2017) [35]:

\mathrm{prob}(m, n, i, j) \leq s_{i} \forall i, j, m, n

(A5)

\mathrm{prob}(m, n, i, j) \leq s_{j} \forall i, j, m, n

(A6)

Constraints (A5) and (A6) ensure that

\mathrm{prob}(m, n, i, j)

is zero unless a spoke is established at both node i and node j.

\sum_{i, j} \mathrm{prob}(m, n, i, j) + \mathrm{prob}(m, n, a, b) = 1 \forall m, n

(A7)

Constraint (A7), which can be viewed as a flow-conservation constraint, states that the initial demand from starting point “m” to final destination “n” is either transported via some O-D pair (i, j) or lost (assigned to the dummy points “a” and “b”).

\mathrm{prob}(m, n, i, j) \leq \frac{U_{m n i j}}{U_{m n k l}} \cdot \mathrm{prob}(m, n, k, l) + 1 - s_{k} \cdot s_{l} \forall i, j, m, n, k, l

(A8)

Constraint (A8) expresses that, when spokes are open at both k and l, the flow transported via O-D pair (i, j) relative to the flow transported via O-D pair (k, l) is proportional to their utilities. This is proven by Khosravi and Akbari Jokar (2017) [35]. The nonlinear term in (A8) is linearized, as shown below:

s_{i} \cdot s_{j} = s_{i j} \forall i, j

(A9)

s_{i} + s_{j} - 1 \leq s_{i j} \forall i, j

(A10)

0.5 s_{i} + 0.5 s_{j} \geq s_{i j} \forall i, j

(A11)

Using this equivalent set of constraints, the formulation becomes a mixed-integer linear programming (MILP) model. The drawback is the large number of constraints produced by (A9)–(A11).
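As a quick sanity check of (A9)–(A11), the short Python sketch below enumerates all binary values of s_i and s_j and lists the values of s_{ij} that satisfy (A10) and (A11). The helper name `feasible_products` is hypothetical and is introduced only for this illustration.

```python
from itertools import product


def feasible_products(s_i, s_j, candidates):
    """Values of s_ij in `candidates` that satisfy (A10) and (A11)."""
    return [v for v in candidates
            if s_i + s_j - 1 <= v and 0.5 * s_i + 0.5 * s_j >= v]


# With s_ij binary, the only feasible value is the product s_i * s_j.
for s_i, s_j in product((0, 1), repeat=2):
    assert feasible_products(s_i, s_j, (0, 1)) == [s_i * s_j]

# If s_ij were relaxed to a continuous variable, (A11) would admit 0.5
# when exactly one spoke is open, so integrality of s_ij matters here.
relaxed = feasible_products(1, 0, (0, 0.5, 1))
```

The check suggests that (A10) and (A11) reproduce the product exactly only when s_{ij} is declared binary. A common alternative to (A11) is the pair of constraints s_{ij} ≤ s_i and s_{ij} ≤ s_j, which also excludes fractional values at the price of one extra constraint per pair.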

b.: Linearization of the multiplication of two variables

In the objective function, both

w_{i j}

and

x_{i j k l}

are decision variables; thus, the term

w_{i j} x_{i j k l}

is the product of two variables and is not linear. As initially defined in the problem definition, we have

x_{i j k l} \in [0, 1]

, which is a continuous variable, but in the absence of the congestion constraint, the optimal solution satisfies

x_{i j k l} \in \{0, 1\}

since if there is an optimal path from an origin point i to destination point j, traversing hub link k-l, the whole flow would choose that path. Therefore, we can treat

x_{i j k l}

as a binary variable. We can then exploit this binary property and linearize the product as follows, by introducing a new variable

q_{i j k l}

such that

q_{i j k l} = w_{i j} x_{i j k l}

(A12)

q_{i j k l} - M (1 - x_{i j k l}) \leq w_{i j} \leq q_{i j k l} + M (1 - x_{i j k l})

(A13)

q_{i j k l} \leq M \cdot x_{i j k l}

(A14)

The same can be applied for

w_{i j} a_{i j k}

and

w_{i j} b_{i j k}

.
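The effect of (A13) and (A14) can be illustrated with a few lines of Python. The sketch assumes that q_{ijkl} is non-negative and that M is at least as large as any flow w_{ij}; both are assumptions of this example (the function name `feasible_q_interval` is also hypothetical) and are not stated in the constraints themselves.

```python
def feasible_q_interval(w, x, big_m):
    """Interval of q values allowed by (A13), (A14) and q >= 0 (assumed)."""
    # Lower bounds: q >= 0 and, from the right side of (A13), q >= w - M(1 - x).
    lower = max(0.0, w - big_m * (1 - x))
    # Upper bounds: (A14) gives q <= M x; the left side of (A13) gives
    # q <= w + M(1 - x).
    upper = min(big_m * x, w + big_m * (1 - x))
    return lower, upper


on = feasible_q_interval(7.5, 1, 100.0)   # x = 1 forces q = w
off = feasible_q_interval(7.5, 0, 100.0)  # x = 0 forces q = 0
```

In both cases the interval collapses to a single point, which is the value of the product w_{ij} x_{ijkl}.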

In constraint (23), we have some nonlinear terms, such as

w_{i j} x_{i j k l} h_{i} h_{j}

(A15)

As discussed earlier, and since both h_{i} and h_{j} are binary variables, we can apply the same linearization technique; because four variables are multiplied, the technique must be applied in three stages to linearize this term.

We introduce a new binary variable

h_{i j}

, such that

h_{i} h_{j} = h_{i j} \forall i, j

(A16)

h_{i} + h_{j} - 1 \leq h_{i j} \forall i, j

(A17)

0.5 h_{i} + 0.5 h_{j} \geq h_{i j} \forall i, j

(A18)

h_{i}, h_{j}, h_{i j} \in \{0, 1\}

(A19)

c.: Nonlinear constraints of the congestion balance

The constraints (24) and (25) dealing with hub congestion are formulated as follows:

g^{\max} = \max_{k \in K} (g_{k}), \quad g^{\min} = \min_{k \in K} (g_{k})

g^{\max} \leq \nabla \cdot g^{\min}

They can be linearized as

g^{\max} \geq g_{k}, \forall k \in K

(A20)

g^{\max} \leq \nabla \cdot g_{k} \forall k \in K

(A21)

In this formulation,

g^{\max}

is the congestion of the most crowded (congested) hub, and if

g^{\max} \leq \nabla \cdot g_{k} \forall k \in K

holds, then it holds in particular for the least congested hub, which gives

g^{\max} \leq \nabla \cdot g^{\min}

.

Constraint (A20) alone places no upper bound on

g^{\max}

, so it could take an arbitrarily large value; constraint (A21) caps it at

\nabla \cdot g_{k}

for every hub k. Together, a feasible value of

g^{\max}

exists only if the congestion of the most congested hub is at most ∇ times the congestion of the least congested hub.
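The equivalence between the max/min form and the linearized form (A20)–(A21) can be checked numerically. The Python sketch below is only an illustration: it assumes a positive congestion factor ∇ (called `factor`), takes the list `g` to hold the congestion of the open hubs, and uses hypothetical function names and numbers.

```python
def balanced_direct(g, factor):
    """Original form: g_max <= factor * g_min."""
    return max(g) <= factor * min(g)


def balanced_linearized(g, factor):
    """(A20)-(A21): is there a value g_max with g_k <= g_max <= factor * g_k?"""
    lower = max(g)                          # tightest lower bound from (A20)
    upper = min(factor * g_k for g_k in g)  # tightest upper bound from (A21)
    return lower <= upper


balanced_case = [10, 15, 18]   # 18 <= 2 * 10, so the hubs are balanced
unbalanced_case = [10, 25]     # 25 >  2 * 10, so the hubs are not balanced
```

For both toy cases the two functions return the same answer, which is what the argument above predicts.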

d.: Economy of scale parameter linearization

As the flow-based hub link transportation cost function brings nonlinear terms into our objective function, we need to apply linearization techniques to address that and linearize the terms. In the original definition (27) and (28), we have

\min \sum_{i} \sum_{j \neq i} \sum_{k} \sum_{l \neq k} α_{k l} w_{i j} C_{k l} x_{i j k l} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{i k} a_{i j k} + \sum_{i} \sum_{j \neq i} \sum_{k \neq i, j} w_{i j} C_{k j} b_{i j k} + \sum_{i} \sum_{j \neq i} w_{i j} C_{i j} e_{i j} + \sum_{k} F_{k} h_{k} + \sum_{k} D_{k} s_{k} + \sum_{k} \sum_{l > k} I_{k l} y_{k l} + B \left(\sum_{m, n} P_{m n} - \sum_{i, j} w_{i j}\right)

α_{k l} = \begin{cases} α_{k l}^{1}, & \sum_{i} \sum_{j \neq i} w_{i j} x_{i j k l} < b \\ α_{k l}^{2}, & b < \sum_{i} \sum_{j \neq i} w_{i j} x_{i j k l} \end{cases}

Since

α_{k l}

is defined piecewise in (28), the hub-link cost is a piecewise-linear function of the hub-link flow, and we can linearize it by applying the following procedure. We first define

ux_{i j k l} = w_{i j} x_{i j k l}

(A22)

(the same product that is denoted by q_{i j k l} in (A12)) and linearize it as earlier in Appendix A. Therefore, the only nonlinear term in the objective function is

\sum_{i} \sum_{j \neq i} \sum_{k} \sum_{l \neq k} α_{k l} w_{i j} C_{k l} x_{i j k l}

(A23)

which is rewritten, using (A12), as

\sum_{i} \sum_{j \neq i} \sum_{k} \sum_{l \neq k} α_{k l} C_{k l} q_{i j k l}

(A24)

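This can be linearized by replacing (A24) with the objective term (A25), subject to constraints (A26)–(A30). Because f_{k l}^{L} carries no i, j index, the sum in (A25) runs over hub links and segments only:

\sum_{k} \sum_{l \neq k} \sum_{L = 1}^{2} α_{k l}^{L} C_{k l} f_{k l}^{L}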

(A25)

\sum_{L = 1}^{2} f_{k l}^{L} = \sum_{i} \sum_{j \neq i} q_{i j k l}

(A26)

f_{k l}^{L} \leq Z_{k l}^{L} \cdot M

(A27)

\sum_{L = 1}^{2} Z_{k l}^{L} = h_{k l}

(A28)

\sum_{i} \sum_{j \neq i} q_{i j k l} \leq b Z_{k l}^{1} + M Z_{k l}^{2}

(A29)

\sum_{i} \sum_{j \neq i} q_{i j k l} \geq b Z_{k l}^{2}

(A30)
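The segment-selection logic of (A26)–(A30) can be sketched in Python for a single open hub link (h_{kl} = 1). In this reading, which is an interpretation of the constraints rather than a definition given in the text, f_{kl}^{L} is the hub-link flow assigned to cost segment L and Z_{kl}^{L} is a binary variable that activates segment L. The function names and the numbers are hypothetical.

```python
def segment_choices(flow, b, big_m):
    """(Z1, Z2) pairs allowed by (A28)-(A30) when the hub link is open."""
    feasible = []
    for z1, z2 in [(1, 0), (0, 1)]:           # (A28) with h_kl = 1
        within_a29 = flow <= b * z1 + big_m * z2
        within_a30 = flow >= b * z2
        if within_a29 and within_a30:
            feasible.append((z1, z2))
    return feasible


def linearized_cost(flow, b, alpha1, alpha2, c_kl, big_m):
    """Cheapest value of the (A25) term for one hub link carrying `flow`."""
    costs = []
    for z1, z2 in segment_choices(flow, b, big_m):
        # (A26)-(A27): the whole flow sits in the single active segment.
        f1, f2 = flow * z1, flow * z2
        costs.append(alpha1 * c_kl * f1 + alpha2 * c_kl * f2)
    return min(costs)


low = linearized_cost(50, 100, 1.0, 0.5, 2.0, 1000)    # below b: factor alpha1
high = linearized_cost(150, 100, 1.0, 0.5, 2.0, 1000)  # above b: factor alpha2
```

With a threshold of b = 100, a flow of 50 can only use the first segment and a flow of 150 can only use the second, so the discounted factor applies only to the larger flow.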

Appendix B. Tightening the Formulation

We use several linearization techniques to convert the MINLP model presented in Section 3 into an MILP model. The resulting model is still a large-scale problem and becomes difficult to solve as the number of nodes in the network increases. In this section, we apply preprocessing rules and add valid inequalities to obtain a tighter formulation, which improves the proposed MILP and helps the solution procedure. Preprocessing techniques and tightening rules were also used in the literature by Boland et al. [56] and Marín [57].

a.: Preprocessing

The following rules are applied as preprocessing techniques:

Not every

y_{k l}

variable can take a positive value in an optimal solution. These variables can take the value 1 only if both end nodes are hub nodes, and establishing the link is worthwhile only if some demand traverses hub link k-l in the optimal solution. For that demand, the route must be economical; in other words, the cost of transportation from the origin node (a) to the first hub (k), transportation on the hub link

y_{k l}

, and transportation from hub (l) to the destination (b) must be lower than the cost of direct transportation (or of any other combination on the path). If no such origin-destination pair exists, the link can be eliminated:

\nexists \, a, b : d_{a k} + α \cdot d_{k l} + d_{l b} < \min_{m, n} \{d_{a m} + d_{m b}, d_{a m} + α \cdot d_{m n} + d_{n b}, d_{a n} + d_{n b}\} \Rightarrow y_{k l} = 0

(A31)

A similar argument can be constructed around the satisfied demand of the (i, j) node pairs (

w_{i j}

) as follows:

Comparing the lost-demand cost of the total satisfied demand with the cost of transporting it gives

B \sum_{i, j} w_{i j} \leq \sum_{i, j} w_{i j} \min_{k, l} \{d_{i k} + α d_{k l} + d_{l j}\} + \min_{k, l} (cs_{k} + cs_{l}) + \min_{k, l} (h_{k}) + \min_{k, l} (y_{k l})

(A32)

The above inequality states that up to a certain level of satisfied/transported demand

\left(\sum_{i, j} w_{i j}\right)

, the model decides to lose all the demand rather than satisfy it. This is also intuitive: for few demand nodes or small demand volumes, it is not economically feasible to establish a hub/spoke facility or any link between them. Therefore, this relation gives the model a minimum value of total satisfied demand, which can be added as an integer cut to the problem:

\begin{cases} \sum_{i, j} w_{i j} < \tilde{w}_{i j}; & \text{lose the demand } w_{i j} \\ \sum_{i, j} w_{i j} \geq \tilde{w}_{i j}; & \text{transport the demand } w_{i j} \end{cases}

(A33)

where

\tilde{w}_{i j} = \frac{\sum_{i, j} w_{i j} \cdot \min_{k, l} \{d_{i k} + α \cdot d_{k l} + d_{l j}\} + \min_{k, l} (cs_{k} + cs_{l}) + \min_{k, l} (h_{k}) + \min_{k, l} (y_{k l})}{B}

(A34)

b.: Tightening

In addition to the preprocessing techniques, we present some valid inequalities obtained by investigating the features of the developed mathematical model. These inequalities tighten the formulation, in the sense of narrowing the gap between the LP relaxation of the model and the integer polyhedron.

The preprocessing argument can be extended into an elimination bound that excludes some hub nodes: a node with no incident hub link should not be a hub. Hence, for any node i, we can state

\sum_{j} (y_{i j} + y_{j i}) = 0 \Rightarrow h_{i} = 0 \forall i

(A35)

We applied this by adding the following cut to the problem:

h_{i} \leq \sum_{j} (y_{i j} + y_{j i}) \forall i

(A36)

In addition, considering the characteristics of a connected graph leads us to have

\sum_{k, l} y_{k l} \geq \sum_{k} h_{k}

(A37)
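A candidate hub configuration can be screened against the cuts (A36) and (A37) with a few lines of Python. The sketch below assumes that hub links are stored as ordered pairs (k, l) with y_{kl} = 1 and that k ≠ l; the function name `violated_cuts` and the example data are hypothetical.

```python
def violated_cuts(nodes, hubs, hub_links):
    """Return the cuts among (A36) and (A37) violated by a configuration.

    nodes     : iterable of node labels
    hubs      : set of nodes with h_i = 1
    hub_links : set of ordered pairs (k, l) with y_kl = 1
    """
    violations = []
    for i in nodes:
        # Left-hand side of (A35): sum over j of (y_ij + y_ji).
        incident = sum(1 for (k, l) in hub_links if i in (k, l))
        h_i = 1 if i in hubs else 0
        if h_i > incident:                    # (A36): h_i <= incident links
            violations.append(("A36", i))
    if len(hub_links) < len(hubs):            # (A37): links >= hubs
        violations.append(("A37", None))
    return violations


ok = violated_cuts(range(4), {1, 2}, {(1, 2), (2, 1)})
bad = violated_cuts(range(4), {1, 2, 3}, {(1, 2), (2, 1)})
```

In the second configuration, node 3 is declared a hub without any incident hub link, so both cuts flag it.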

We can also replace the big-M constant with a value derived from the relevant model parameters to further tighten the formulation, using

M = \sum_{i, j} w_{i j}

(A38)

References

1. Bhatt, S.; Sinha, A.; Jayaswal, S. The Capacitated R-Hub Interdiction Problem with Congestion: Models and Solution Approaches. Transp. Res. Part E Logist. Transp. Rev. 2024, 185, 103482.
2. Redondo, J.L.; Fernández, J.; Arrondo, A.G.; García, I.; Ortigosa, P.M. Fixed or variable demand? Does it matter when locating a facility? Omega 2012, 40, 9–20.
3. Contreras, I. Hub Location Problems. In Location Science; Springer: Cham, Switzerland, 2015; pp. 311–344.
4. Pels, E. Optimality of the hub-spoke system: A review of the literature, and directions for future research. Transp. Policy 2021, 104, A1–A10.
5. Alumur, S.; Kara, B.Y. Network hub location problems: The state of the art. Eur. J. Oper. Res. 2008, 190, 1–21.
6. Campbell, J.F.; O’Kelly, M.E. Twenty-Five Years of Hub Location Research. Transp. Sci. 2012, 46, 153–169.
7. Alumur, S.A.; Campbell, J.F.; Contreras, I.; Kara, B.Y.; Marianov, V.; O’Kelly, M.E. Perspectives on modeling hub location problems. Eur. J. Oper. Res. 2021, 291, 1–17.
8. Farahani, R.Z.; Hekmatfar, M.; Arabani, A.B.; Nikbakhsh, E. Hub location problems: A review of models, classification, solution techniques, and applications. Comput. Ind. Eng. 2013, 64, 1096–1109.
9. Campbell, J.F.; Ernst, A.T.; Krishnamoorthy, M. Hub Arc Location Problems: Part II—Formulations and Optimal Algorithms. Manag. Sci. 2005, 51, 1556–1571.
10. Zhou, S.; Ji, B.; Song, Y.; Samson, S.Y.; Zhang, D.; Van Woensel, T. Hub-and-spoke network design for container shipping in inland waterways. Expert Syst. Appl. 2023, 223, 119850.
11. Asgari, N.; Farahani, R.Z.; Goh, M. Network design approach for hub ports-shipping companies competition and cooperation. Transp. Res. Part A Policy Pract. 2013, 48, 1–18.
12. Baird, A.J. Optimising the container transhipment hub location in northern Europe. J. Transp. Geogr. 2006, 14, 195–214.
13. Chou, C.C. Application of FMCDM model to selecting the hub location in the marine transportation: A case study in southeastern Asia. Math. Comput. Model. 2010, 51, 791–801.
14. Gelareh, S.; Pisinger, D. Fleet deployment, network design and hub location of liner shipping companies. Transp. Res. Part E Logist. Transp. Rev. 2011, 47, 947–964.
15. Arnold, P.; Peeters, D.; Thomas, I. Modelling a rail/road intermodal transportation system. Transp. Res. Part E Logist. Transp. Rev. 2004, 40, 255–270.
16. Walha, F.; Bekrar, A.; Chaabane, S.; Loukil, T.M. A rail-road PI-hub allocation problem: Active and reactive approaches. Comput. Ind. 2016, 81, 138–151.
17. Jaillet, P.; Song, G.; Yu, G. Airline network design and hub location problems. Locat. Sci. 1996, 4, 195–212.
18. Oktal, H.; Ozger, A. Hub location in air cargo transportation: A case study. J. Air Transp. Manag. 2013, 27, 1–4.
19. Yang, T.H. Airline network design problem with different airport capacity constraints. Transportmetrica 2008, 4, 33–49.
20. Yang, T.H. Stochastic air freight hub location and flight routes planning. Appl. Math. Model. 2009, 33, 4424–4430.
21. Kreutzberger, E.; Konings, R. The challenge of appropriate hub terminal and hub-and-spoke network development for seaports and intermodal rail transport in Europe. Res. Transp. Bus. Manag. 2016, 19, 83–96.
22. Perea, F.; Mesa, J.A.; Laporte, G. Adding a new station and a road link to a road–rail network in the presence of modal competition. Transp. Res. Part B Methodol. 2014, 68, 1–16.
23. Xia, W.; Zhang, A. Air and high-speed rail transport integration on profits and welfare: Effects of air-rail connecting time. J. Air Transp. Manag. 2017, 65, 181–190.
24. Xiao, G.; Xiao, Y.; Shu, Y.; Ni, A.; Jiang, Z. Technical and economic analysis of battery electric buses with different charging rates. Transp. Res. Part D Transp. Environ. 2024, 132, 104254.
25. Gelareh, S.; Nickel, S. Hub location problems in transportation networks. Transp. Res. Part E Logist. Transp. Rev. 2011, 47, 1092–1111.
26. Magnanti, T.L.; Wong, R.T. Accelerating Benders Decomposition: Algorithmic Enhancement and Model Selection Criteria. Oper. Res. 1981, 29, 464–484.
27. Zheng, Y.; Ji, Y.; Shen, Y.; Liu, B.; Du, Y. Hub location problem considering spoke links with incentive-dependent capacities. Comput. Oper. Res. 2022, 148, 105959.
28. Campbell, J.F. Modeling Economies of Scale in Transportation Hub Networks. In Proceedings of the 2013 46th Hawaii International Conference on System Sciences, Wailea, HI, USA, 7–10 January 2013; pp. 1154–1163.
29. Campbell, J.F.; Miranda, G.; Camargo, R.S.; O’Kelly, M.E. Hub Location and Network Design with Fixed and Variable Costs. In Proceedings of the 2015 48th Hawaii International Conference on System Sciences, Kauai, HI, USA, 5–8 January 2015; pp. 1059–1067.
30. Eiselt, H.A.; Marianov, V. A conditional p-hub location problem with attraction functions. Comput. Oper. Res. 2009, 36, 3128–3135.
31. Francis, R.L.; Lowe, T.J.; Rayco, M.B.; Tamir, A. Aggregation error for location models: Survey and analysis. Ann. Oper. Res. 2009, 167, 171–208.
32. Francis, R.L.; Lowe, T.J.; Tamir, A. Aggregation Error Bounds for a Class of Location Models. Oper. Res. 2000, 48, 294–307.
33. Carling, K.; Håkansson, J. A compelling argument for the gravity p-median model. Eur. J. Oper. Res. 2013, 226, 658–660.
34. Alibeyg, A.; Contreras, I.; Fernández, E. Hub network design problems with profits. Transp. Res. Part E Logist. Transp. Rev. 2016, 96, 40–59.
35. Khosravi, S.; Akbari Jokar, M.R. Facility and hub location model based on gravity rule. Comput. Ind. Eng. 2017, 109, 28–38.
36. Drezner, T. Derived attractiveness of shopping malls. IMA J. Manag. Math. 2006, 17, 349–358.
37. Drezner, T.; Drezner, Z. A Note on Applying the Gravity Rule to the Airline Hub Problem. J. Reg. Sci. 2001, 41, 67–72.
38. Drezner, T.; Drezner, Z. The gravity multiple server location problem. Comput. Oper. Res. 2011, 38, 694–701.
39. Grove, P.G.; O’Kelly, M.E. Hub Networks and Simulated Schedule Delay. Pap. Reg. Sci. 1986, 59, 103–119.
40. Yu, B.; Guo, Z.; Asian, S.; Wang, H.; Chen, G. Flight delay prediction for commercial air transport: A deep learning approach. Transp. Res. Part E Logist. Transp. Rev. 2019, 125, 203–221.
41. Ebery, J.; Krishnamoorthy, M.; Ernst, A.; Boland, N. The capacitated multiple allocation hub location problem: Formulations and algorithms. Eur. J. Oper. Res. 2000, 120, 614–631.
42. Ernst, A.T.; Krishnamoorthy, M. Solution algorithms for the capacitated single allocation hub location problem. Ann. Oper. Res. 1999, 86, 141–159.
43. Elhedhli, S.; Hu, F.X. Hub-and-spoke network design with congestion. Comput. Oper. Res. 2005, 32, 1615–1632.
44. de Camargo, R.S.; Miranda, G., Jr.; Ferreira, R.P.M.; Luna, H.P. Multiple allocation hub-and-spoke network design under hub congestion. Comput. Oper. Res. 2009, 36, 3097–3106.
45. Najy, W.; Diabat, A. Benders decomposition for multiple-allocation hub-and-spoke network design with economies of scale and node congestion. Transp. Res. Part B Methodol. 2020, 133, 62–84.
46. Nickel, S.; Schöbel, A.; Sonneborn, T. Hub Location Problems in Urban Traffic Networks. In Applied Optimization. Mathematical Methods on Optimization in Transportation Systems; Springer: Boston, MA, USA, 2001; pp. 95–107.
47. Gelareh, S.; Neamatian Monemi, R.; Nickel, S. Multi-period hub location problems in transportation. Transp. Res. Part E Logist. Transp. Rev. 2015, 75, 67–94.
48. O’Kelly, M.E.; Bryan, D.L. Hub location with flow economies of scale. Transp. Res. Part B Methodol. 1998, 32, 605–616.
49. de Camargo, R.S.; de Miranda, G.; Luna, H.P.L. Benders Decomposition for Hub Location Problems with Economies of Scale. Transp. Sci. 2008, 43, 86–97.
50. Naoum-Sawaya, J.; Elhedhli, S. An interior-point Benders based branch-and-cut algorithm for mixed integer programs. Ann. Oper. Res. 2013, 210, 33–55.
51. Rahmaniani, R.; Crainic, T.G.; Gendreau, M.; Rei, W. The Benders decomposition algorithm: A literature review. Eur. J. Oper. Res. 2017, 259, 801–817.
52. Claus, A. A New Formulation for the Travelling Salesman Problem. SIAM J. Algebr. Discret. Methods 1984, 5, 21–25.
53. Zarandi, M.M.F. Using Decomposition to Solve Facility Location/Fleet Managment Problems. Ph.D. Thesis, University of Toronto, Toronto, ON, Canada, 2010. Available online: https://tspace.library.utoronto.ca/handle/1807/24945 (accessed on 9 July 2024).
54. Zverovich, V.; Fábián, C.I.; Ellison, E.F.D.; Mitra, G. A computational study of a solver system for processing two-stage stochastic LPs with enhanced Benders decomposition. Math. Program. Comput. 2012, 4, 211–238.
55. Linderoth, J.; Wright, S. Decomposition Algorithms for Stochastic Programming on a Computational Grid. Comput. Optim. Appl. 2003, 24, 207–250.
56. Boland, N.; Krishnamoorthy, M.; Ernst, A.T.; Ebery, J. Preprocessing and cutting for multiple allocation hub location problems. Eur. J. Oper. Res. 2004, 155, 638–653.
57. Marín, A. Formulating and solving splittable capacitated multiple allocation hub location problems. Comput. Oper. Res. 2005, 32, 3093–3109.

Table 1. Solving the flow routing problem.

Step 0	initial pool ← $\emptyset$; selected pool ← $\emptyset$; solution pool-2 ← $\emptyset$; solution pool-3 ← $\emptyset$; set stop ← false
Step 1	for (i = 1:100) do; run 01 and 02, and use their results to solve 03; initial pool ← initial pool $\cup$ solution (i); i ← i + 1; end for
Step 2	selected pool ← the top 25 solutions from initial pool
Step 3	for (i = 1:25); terminate ← false; while (terminate = false) do; apply the 1-opt algorithm on solution (i) from selected pool; if the 1-opt modified solution (i) is better than the original then “solution pool-2” ← “solution pool-2” $\cup$ 1-opt modified solution (i); else “solution pool-2” ← “solution pool-2” $\cup$ solution (i); end if; if there is no 1-opt modification left then terminate ← true; end if; end while; end for
Step 4	selected pool-2 ← the top 25 solutions from “solution pool-2”
Step 5	for (i = 1:25); terminate ← false; while (terminate = false) do; apply the 2-opt algorithm on solution (i) from selected pool-2; if the 2-opt modified solution (i) is better than the original then “solution pool-3” ← “solution pool-3” $\cup$ 2-opt modified solution (i); else “solution pool-3” ← “solution pool-3” $\cup$ solution (i); end if; if there is no 2-opt modification left then terminate ← true; end if; end while; end for
Step 6	for i = 1:n; for j = 1:25; s(i) ← s(i) + sj(i); h(i) ← h(i) + hj(i); for k = 1:n; y(ik) ← y(ik) + yj(ik); end for; end for; if s(i) >= ξ% * 25 then in the final solution s(i) ← 1; end if; if s(i) = 0 then in the final solution s(i) ← 0; end if; if h(i) >= ξ% * 25 then in the final solution h(i) ← 1; end if; if h(i) = 0 then in the final solution h(i) ← 0; end if; if y(ik) >= ξ% * 25 then in the final solution y(ik) ← 1; end if; if y(ik) = 0 then in the final solution y(ik) ← 0; end if; end for
Step 7	fix the values of s(i), h(i), y(ik) from the previous step; update the budget constraint; go to Steps 0–5 (use the updated solution pool-3)
Step 8	terminate ← false; while (terminate = false) do; apply the 1-opt algorithm on a solution from solution pool-3; if the 1-opt modified solution is better than the original then “final pool” ← “final pool” $\cup$ 1-opt modified solution; else “final pool” ← “final pool” $\cup$ solution; end if; if there is no 1-opt modification left then terminate ← true; end if; end while
Step 9	choose the best solution from the “final pool”
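The consensus rule of Step 6 in Table 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: each solution in the pool is represented as a dictionary that maps a variable name (for example "h1" or "y12") to 0 or 1, and the threshold ξ is passed as a fraction `xi`. The function name and the toy pool are hypothetical.

```python
def consensus_fix(solutions, xi):
    """Fix variables on which the solution pool agrees (Step 6 of Table 1).

    A variable is fixed to 1 if at least xi * len(solutions) solutions set it
    to 1, fixed to 0 if no solution sets it to 1, and left free otherwise.
    """
    pool_size = len(solutions)
    fixed = {}
    for var in solutions[0]:
        votes = sum(sol[var] for sol in solutions)
        if votes >= xi * pool_size:
            fixed[var] = 1
        elif votes == 0:
            fixed[var] = 0
    return fixed


# Toy pool of four solutions (the table uses the top 25).
pool = [
    {"h1": 1, "h2": 0, "h3": 1},
    {"h1": 1, "h2": 0, "h3": 0},
    {"h1": 1, "h2": 0, "h3": 0},
    {"h1": 0, "h2": 0, "h3": 0},
]
fixed = consensus_fix(pool, 0.75)
```

Here "h1" is fixed to 1 (three of four votes), "h2" is fixed to 0 (no votes), and "h3" stays free for the next round of Steps 0–5.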

Table 2. Experiments with the original model (HSLPT-01).

	α = 0.2							α = 0.5						α = 0.8
Lost Demand Cost	# of Nodes	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time	Objective Improvement	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time	Objective Improvement	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time	Objective Improvement
0.8	6	6	1	0	50,052	0:02	27%	6	1	0	50,052	0:02	15%	6	1	0	50,052	0:02	9%
	7	7	1	0	87,229	6:33	24%	7	1	0	87,229	6:13	20%	7	1	0	87,229	7:41	12%
	8	8	1	0	94,333	2:02	32%	8	1	0	94,333	1:54	24%	8	1	0	94,333	1:54	12%
	9	6	2	2	110,123	1:49:17	31%	6	2	2	129,460	1:11:32	22%	6	1	0	141,499	1:37:18	15%
	10	9	1	0	131,628	4:56:19	23%	9	1	0	149,471	5:45:50	19%	9	1	0	149,471	7:14:46	11%
0.5	6	6	1	0	50,052	0:02	23%	6	1	0	50,052	0:02	17%	6	1	0	50,052	0:02	11%
	7	7	1	0	87,229	4:24	23%	7	1	0	87,229	4:25	19%	7	1	0	87,229	4:28	18%
	8	7	1	0	93,487	1:56	25%	7	1	0	93,487	1:49	21%	7	1	0	93,487	1:53	21%
	9	6	2	2	103,771	1:32:18	28%	6	2	2	120,789	1:21:04	29%	6	1	0	125,587	1:21:45	23%
	10	7	2	2	106,750	6:24:00	24%	7	2	2	120,273	6:09:22	21%	7	2	2	134,305	7:14:34	17%
0.2	6	5	1	0	47,134	0:02	29%	5	1	0	47,134	0:02	19%	5	1	0	47,134	0:02	10%
	7	3	2	2	72,698	5:39	21%	3	2	2	78,886	5:52	22%	5	1	0	82,370	5:58	16%
	8	5	2	2	74,010	1:50	27%	5	2	2	78,338	1:38	23%	5	1	0	80,484	1:16	19%
	9	6	2	2	88,254	1:31:29	32%	6	2	2	104,633.9	1:48:58	25%	6	1	0	109,520	1:34:58	23%
	10	7	2	2	98,338	4:48:54	28%	7	2	2	108,920	4:22:03	21%	8	1	0	117,089	6:23:34	33%
0.05	6	4	1	0	31,650	0:02	18%	4	1	0	31,650	0:02	16%	4	1	0	31,650	0:02	11%
	7	2	2	2	29,972	4:13	21%	3	2	2	39,364	4:18	19%	2	1	0	45,872	3:54	19%
	8	3	2	2	35,864	1:00	24%	3	2	2	42,194	1:09	25%	3	2	2	48,525	1:08	18%
	9	3	2	2	45,492	41:20	34%	3	2	2	52,048	41:05	23%	3	2	2	56,210	1:10:34	24%
	10	5	3	6	55,758	3:45:16	29%	5	3	6	59,524	5:27:22	21%	6	2	2	65,374	7:16:24	27%

Table 3. Benders decomposition approach vs. CPLEX.

		α = 0.2							α = 0.5
Lost demand Cost	Number of Nodes	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time			Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time
Lost demand Cost	Number of Nodes	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	CPLEX	Benders	Modified Benders	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	CPLEX	Benders	Modified Benders
0.2	10	7	2	2	98,338	4:48:54	0:40	0:27	7	2	2	108,920	4:22:03	0:49	0:34
	15	9	4	12	121,743	*	4:15	3:19	9	4	12	128,928	*	5:52	4:12
	20	12	7	36	179,425	*	18:18	13:13	12	6	26	196,512	*	35:47	26:48
	25	14	8	42	213,515	*	17:39	14:34	15	8	42	244,324	*	32:38	27:05
	30	17	8	50	241,272	*	45:37	36:48	17	8	50	288,864	*	1:00:38	43:28
	35	23	10	64	371,560	*	1:24:26	59:09	23	9	64	471,715	*	1:35:35	1:12:49
	40	25	11	76	497,890	*	1:54:45	1:17:40	26	10	70	598,512	*	1:46:45	1:26:34

Table 4. Analysis of the effect of congestion (congestion factor = 2, HSLPT-02 model).

		α = 0.2					α = 0.5					α = 0.8
Lost Demand Cost	# of Nodes	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time
0.8	10	7	2	2	137,433	0:47	7	2	2	149,312	0:41	7	2	2	162,859	1:55
	15	9	6	24	214,001	7:35	9	6	24	241,892	11:34	9	6	24	270,049	23:43
	20	13	8	42	311,513	17:34	13	8	42	348,906	25:32	13	7	36	460,092	46:53
	25	16	9	56	378,992	22:35	16	8	42	540,155	36:42	16	7	36	513,475	1:14:52
	30	18	9	56	426,821	1:07:35	18	9	52	604,484	1:34:21	18	8	42	623,230	2:17:34
	35	23	11	72	711,848	1:59:41	23	11	72	943,419	2:54:17	23	9	52	1,075,409	3:51:17
	40	25	10	52	949,615	3:17:46	25	9	52	1,236,130	3:56:19	24	9	52	1,408,239	5:02:20
0.5	10	7	2	2	109,265	0:32	7	2	2	120,901	0:35	7	2	2	131,795	1:23
	15	10	6	24	168,786	8:34	10	6	24	190,782	14:36	9	5	16	205,256	25:45
	20	13	8	52	236,946	18:32	13	7	38	265,392	30:25	13	6	24	349,731	44:06
	25	15	7	34	296,551	24:41	16	7	34	391,786	48:42	16	7	34	408,236	1:31:15
	30	17	9	56	342,341	1:42:19	17	9	52	445,194	1:54:56	16	8	40	459,000	2:34:41
	35	23	10	60	520,897	2:49:12	21	11	72	674,985	3:17:54	20	9	56	811,262	4:53:17
	40	26	12	108	694,940	3:45:18	25	11	72	911,150	4:38:22	25	10	60	998,772	6:30:27
0.2	10	7	2	2	98,338	0:58	7	2	2	108,920	1:07	8	1	0	117,089	1:01
	15	9	5	16	147,311	6:47	8	5	16	159,304	13:34	8	5	16	168,990	27:35
	20	12	6	24	209,689	21:24	12	6	30	231,662	27:30	12	6	30	299,458	48:18
	25	14	7	34	240,219	32:49	15	7	34	317,320	37:35	15	7	34	330,690	1:41:34
	30	17	9	56	288,263	1:19:51	17	8	52	371,993	1:59:45	15	7	34	385,786	3:51:23
	35	23	10	60	429,252	2:17:47	22	9	52	546,104	3:09:16	20	9	52	651,302	5:31:46
	40	25	12	108	558,023	3:13:24	24	10	68	731,501	4:49:03	24	9	52	794,758	7:29:18
0.05	10	5	2	2	58,538	0:35	5	2	2	65,027	0:34	5	2	2	71,170	0:37
	15	7	4	10	93,300	8:24	7	4	10	91,031	15:46	7	4	10	107,098	26:34
	20	11	7	34	120,951	20:52	10	6	34	135,113	29:41	10	5	20	176,209	54:33
	25	14	7	34	134,217	35:43	14	6	24	181,659	39:43	14	6	22	192,748	1:47:34
	30	15	7	34	154,433	1:14:54	15	7	32	218,988	1:54:51	15	6	24	225,150	3:18:34
	35	22	9	54	232,281	2:41:20	20	9	54	304,084	3:03:17	18	9	54	366,807	4:48:30
	40	23	12	96	301,814	3:39:26	22	11	76	395,898	4:23:10	22	10	68	442,619	6:25:40

Table 5. Results with flow-based cost function (HSLPT-03).

		α = 0.2						α = 0.5
Lost Demand Cost	# of Nodes	Spoke Nodes	Hub Nodes	Hub Links	Total Cost	Run Time	Total Cost	Run Time	Objective Improvement
0.2	10	7	2	2	89,446	0:46	108,920	0:34	18%
	15	9	4	12	108,161	5:29	128,928	4:12	16%
	20	12	7	32	164,332	29:15	196,512	26:48	16%
	25	14	8	42	205,178	32:47	244,324	27:05	16%
	30	18	8	50	230,906	51:20	288,864	43:28	20%
	35	23	10	64	397,384	1:35:04	471,715	1:12:49	16%
	40	25	10	72	481,768	1:48:19	598,512	1:26:34	20%

Table 6. Results of the heuristic algorithm.

		α = 0.2					α = 0.5
Lost Demand Cost	# of Nodes	Total Cost			Run time		Total Cost			Run time
Lost Demand Cost	# of Nodes	Heuristic	Modified Benders	Gap %	Heuristic	Modified Benders	Heuristic	Modified Benders	Gap %	Heuristic	Modified Benders
0.2	10	111,122	98,338	13%	0:22	0:27	124,169	108,920	14%	0:30	0:34
	15	132,700	121,743	9%	1:47	3:19	146,978	128,928	14%	2:46	4:12
	20	186,602	179,425	4%	9:26	13:13	206,338	196,512	5%	16:03	26:48
	25	232,731	213,515	9%	11:13	14:34	263,870	244,324	8%	18:25	27:05
	30	255,748	241,272	6%	20:54	36:48	309,084	288,864	7%	24:20	43:28
	35	390,138	371,560	5%	34:46	59:09	518,887	471,715	10%	39:05	1:12:49
	40	532,742	497,890	7%	44:19	1:17:40	628,438	598,512	5%	47:19	1:26:34
	50	622,221	*	*	1:25:19	*	685,516	*	*	1:30:18	*
	60	748,105	*	*	1:53:20	*	775,384	*	*	1:54:29	*
	70	761,774	*	*	2:29:16	*	919,331	*	*	2:38:52	*
	85	801,585	*	*	2:54:37	*	912,481	*	*	3:01:25	*

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Khosravi, S.; Bozorgi, A.; Zahedi-Seresht, M. Hub-and-Spoke Network Design Considering Congestion and Flow-Based Cost Function. Appl. Sci. 2024, 14, 6416. https://doi.org/10.3390/app14156416

AMA Style

Khosravi S, Bozorgi A, Zahedi-Seresht M. Hub-and-Spoke Network Design Considering Congestion and Flow-Based Cost Function. Applied Sciences. 2024; 14(15):6416. https://doi.org/10.3390/app14156416

Chicago/Turabian Style

Khosravi, Shahrzad, Ali Bozorgi, and Mazyar Zahedi-Seresht. 2024. "Hub-and-Spoke Network Design Considering Congestion and Flow-Based Cost Function" Applied Sciences 14, no. 15: 6416. https://doi.org/10.3390/app14156416
