1. Introduction
Preliminary results of this work have been presented at IEEE CCNC 2019 [1].
Software-defined networking (SDN) decouples the control and data planes, transferring the control logic to the SDN controllers and leaving only the forwarding actions to the network equipment. The network devices can be switches, computers or sensors, which forward packets to each other according to a controller-defined strategy. At first, the SDN architecture relied on a single controller communicating with all devices, named SDN switches. However, this approach does not scale and was soon superseded by a design exploiting more than one controller. In this design, the load of the controller-to-switch (Ctr–Sw) traffic is shared between the controllers, at the expense of extra controller-to-controller (Ctr–Ctr) traffic, which is necessary for the synchronization of the controllers.
IoT networks are dynamic and susceptible to environmental changes; thus, SDN complements IoT with its adjustable nature. Nevertheless, the volume of the SDN control traffic is critical for efficient IoT operation, since it affects the total energy consumption [2] and reduces the available bandwidth for the data traffic. In general, IoT networks suffer from limited energy and communication facilities, since they mostly rely on battery-powered wireless sensors. Thus, the minimization of the control traffic is very important for SDN-based IoT networks. This objective is significantly affected by the controller placement, which is the selection of the IoT nodes hosting SDN controllers, apart from being SDN switches themselves.
Starting from [3], a substantial amount of work has already been devoted to the controller placement problem, considering a wide variety of objectives. The two questions typically asked are: (i) How many controllers are required? (ii) Where should they be placed? Depending on the objective, the closely related problem of selecting the controller to assign a switch to (the switch assignment problem) might also be non-trivial. Although the existing literature mainly proposes controller placements that minimize the time delays for the Ctr–Sw traffic, this paper considers the effect of the controller placement on the total control traffic (both Ctr–Ctr and Ctr–Sw) and aims at its minimization. Additionally, from an architectural point of view, the primary concern of SDN-based IoT should be whether its controller placement is distributed or centralized, since it significantly impacts the network performance in terms of energy consumption, scalability and reliability [4].
The take-home message of this paper is that as the number of controllers increases and the placement becomes more distributed, the controllers get closer to the switches and the volume of the
Ctr–Sw traffic decreases. On the other hand, if the controllers are fewer and more concentrated, then the volume of the
Ctr–Ctr traffic decreases. To exemplify those points,
Figure 1 shows in the middle an IoT network with six sensors behaving as SDN switches. The left controller placement is more centralized, using only two controllers collocated with switches 1 and 2, while the right placement is more distributed, using three controllers placed at switches
and 3. In both left and right placements, all switches have to communicate with one of the 2 or 3 controllers through 4 or 3
Ctr–Sw (blue) channels, while the controllers need 1 or 3 Ctr–Ctr (red) channels respectively. The left centralized placement features less Ctr–Ctr but more Ctr–Sw traffic compared to the right distributed placement, while the placement that minimizes the total control traffic depends on the per-unit-load of these two types of traffic, which are their minimum values in the simplest case of a Ctr–Sw or Ctr–Ctr channel.
The fourfold contributions of this paper are to:
- (i) Model the controller placement and switch assignment problem using integer quadratic programming (IQP) with the objective to minimize the required bandwidth for the total control traffic;
- (ii) Propose and evaluate a set of heuristic algorithms that expedite the aforementioned problem solution, since the IQP complexity does not scale well with the network size;
- (iii) Compare the performance of the optimal (given by IQP) and heuristic solutions, using network topologies given by the Internet Topology Zoo collection [5] (considering the graphs of this reference point as IoT network topologies);
- (iv) Provide testbed measurements for estimating the volume of both control traffic types and their per-unit-load.
In our model, the Ctr–Sw (southbound) and Ctr–Ctr (east-west) protocols are OpenFlow and Raft [6,7] respectively, which are used by the state-of-the-art SDN controllers [8], such as OpenDaylight [9] and ONOS [10].
The remainder of this paper is organized as follows. Section 2 introduces related work. We present the system model and problem statement in Section 3, followed by an analysis of its optimal solution in Section 4. In Section 5, we devise a simple heuristic to expedite the problem solution, while in Section 6, we build on this and propose two iterative heuristic algorithms. Section 7 presents our experimentation results in the NITOS testbed. In Section 8, we discuss some aspects and design choices in our analysis, and provide directions for future work. Finally, Section 9 concludes the paper.
2. Related Work
The controller placement problem was introduced by Heller et al. in [3], narrowing its focus to two questions: given a network topology of SDN switches, how many controllers are needed and where in the topology should they be placed? Multiple optimization goals differentiate the answers to these questions; e.g., the minimization of the average or the worst Ctr–Sw delay is achieved by placing the controllers according to the solution of the minimum k-median or k-center problem respectively. These optimization problems are NP-hard; thus, heuristics are exploited, such as the k-medoids algorithm [11]. The optimized criteria can also be related to the minimization of the controller load imbalance, such as in the capacitated controller placement problem [12], which also considers the controller capacity and is formulated as a mixed integer linear programming (MILP) problem.
In [8], new optimized criteria are introduced by considering the Ctr–Ctr traffic in addition to the Ctr–Sw one. The goal is to minimize the reaction time perceived at the switches, having in mind that it depends on the Ctr–Ctr delays as well as the Ctr–Sw ones. A joint study of the Ctr–Sw and Ctr–Ctr traffic overhead costs is presented in [13], minimizing the weighted sum of these two costs and two extra ones. The problem is formulated as an integer linear programming (ILP) problem and two heuristics are presented. An interesting aspect in [13] is that the dynamicity of decisions is taken into account, and the switch reassignment cost is explicitly included in the model. Reassignment cost has also been considered in [14] in a virtual evolved packet core (vEPC) setting, where the relocation frequency of a vEPC is among the metrics to minimize. In [11], Pareto-optimal placements are derived aiming to solve a multi-objective optimization, where one of the objectives is the minimization of the control traffic. To this end, a MATLAB framework is developed, capable of producing both an exact solution with exhaustive search and an approximate solution using Pareto simulated annealing. In [15], a learning automaton (LA)-based heuristic algorithm is introduced for controller placement. In [16], the objective of minimizing the overhead of software defined measurements is considered; a related IQP problem is formulated assuming fixed Ctr–Ctr costs; and an approximation algorithm with a fixed approximation ratio is devised. In [17], an approximation algorithm with a guaranteed performance bound is derived for a model similar to the one we consider in this paper; however, the proof requires that the per-unit-load of the Ctr–Ctr traffic does not depend on the number of flows installed in the switches, which is not the case in practice.
This paper is an extended version of our first study [1] of the controller placement and switch assignment problem, focusing exclusively on the minimization of the required bandwidth for the total control traffic. We focus on this optimization objective because we want to analyze how the opposing tendencies of centralizing or distributing the control plane affect the volume of the control traffic, as outlined in [4]. In contrast to our previous study [1], we introduce two additional heuristic algorithms with enhanced performance, at the cost of slightly more computation time (more details in Section 6). The new heuristics build upon the base heuristic presented in our previous work and apply a local search approach to discover even better placements. Moreover, we have added results for a modified version of the optimization problem with an additional constraint, which are particularly helpful as a benchmark for our heuristic estimating the number of needed controllers. Finally, a new section has been added, where we discuss some of the design choices in this paper and provide hints for further research on the topic, such as how to use our heuristic methods for minimizing both the total control traffic and the average Ctr–Sw delay.
3. System Model and Problem Statement
Let us assume an SDN-based IoT network, represented by a network graph $G = (\mathcal{S}, \mathcal{L})$, where $\mathcal{S}$ is the set of $S$ switches (or sensors) and $\mathcal{L}$ is the set of the $L$ network links between them. Without loss of generality, we assume that the control traffic is routed through the shortest path, which is $p_{s,s'}$ for connecting switches $s, s' \in \mathcal{S}$, and the number of links included in this path is $|p_{s,s'}|$. $\mathcal{C} \subseteq \mathcal{S}$ is the subset of switches where controllers are placed. From now on, we may refer to $c \in \mathcal{C}$ as a controller or the switch hosting it, interchangeably. Let $c_s \in \mathcal{C}$ denote the controller that switch $s$ is assigned to. Vector $\mathbf{c} = (c_s)_{s \in \mathcal{S}}$ describes a controller placement and switch assignment, where each vector coordinate maps to a switch $s$ and the vector value indicates the corresponding controller $c_s$. The controller placement $\mathcal{C}$ is given by the set of the vector values.
Our goal is to minimize the volume of the total control traffic, which is the sum of the Ctr–Sw and Ctr–Ctr traffic. The aggregate required bandwidth for the Ctr–Sw traffic from all network links is

$$B_{sw} = \sum_{s \in \mathcal{S}} w_{s,c_s}\, b_{s,c_s}, \tag{1}$$

where $w_{s,c_s} = |p_{s,c_s}|$, and $b_{s,c_s}$ is the bandwidth required for the Ctr–Sw traffic between switch $s$ and controller $c_s$. The southbound protocol dependent $b_{s,c_s}$ are independent of the controller placement; thus, $B_{sw}$ decreases with $w_{s,c_s}$, which happens with many distributed controllers close to the switches. On the other hand, the corresponding aggregate required bandwidth for the Ctr–Ctr traffic is

$$B_{ctr} = \sum_{c \in \mathcal{C}} \sum_{c' \in \mathcal{C}} w_{c,c'}\, b_{c,c'}, \tag{2}$$

where $w_{c,c'} = |p_{c,c'}|$ ($c \in \mathcal{C}$ and $c' \in \mathcal{C}$), and $b_{c,c'}$ is the bandwidth required for the Ctr–Ctr traffic initiated from $c$ and sent to $c'$. The east-west protocol dependent $b_{c,c'}$ are independent of the controller placement; thus, $B_{ctr}$ decreases with $w_{c,c'}$, which happens with few centralized controllers, one close to the other.

In this work, we study the optimal controller placement and switch assignment $\mathbf{c}^*$ for minimum control traffic, where $\mathcal{C}^*$ gives the optimal placement of $C^* = |\mathcal{C}^*|$ controllers. The solution of this problem is defined as

$$\mathbf{c}^* = \arg\min_{\mathbf{c}} \left( B_{sw} + B_{ctr} \right). \tag{3}$$
Note that we can alternatively choose the weights w to reflect the energy consumed per goodput byte for a transmission along the path between the switch (or controller) and the (other) controller. In this way, we seek to minimize the total energy consumed for the control traffic. The mathematical formulation is exactly the same in both cases, so throughout this paper we consider the weights to represent shortest path lengths for simplicity.
Finally, we make the following remarks, identifying the per-unit-load of both types of traffic. More specifically, we assume (and validate later in our experimentation, presented in Section 7) that:
Remark 1. The required bandwidth for the Ctr–Sw traffic exchanged between a switch and its controller is proportional to the number of flows existing in this switch.
Remark 2. The required bandwidth for the Ctr–Ctr traffic exchanged between two controllers and initiated from one of these two is proportional to the number of switches assigned to this controller.
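According to Remark 1, if $b_{sw}$ is the required bandwidth for a flow, then $b_{s,c_s} = b_{sw} f_s$, where $f_s$ is the number of flows existing in switch $s$. We also assume that $f$ denotes the average number of flows per switch. Moreover, in line with Remark 2, if $b_{ctr}$ is the volume of the Ctr–Ctr traffic for each assigned switch, then $b_{c,c'} = b_{ctr} n_c$, where $n_c$ denotes the number of switches assigned to controller $c$. It follows that the problem of Equation (3) can be rewritten as

$$\mathbf{c}^* = \arg\min_{\mathbf{c}} \left( b_{sw} \sum_{s \in \mathcal{S}} w_{s,c_s} f_s + b_{ctr} \sum_{c \in \mathcal{C}} \sum_{c' \in \mathcal{C}} w_{c,c'}\, n_c \right). \tag{4}$$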
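As a minimal sketch of how the objective of Equation (4) can be evaluated for a given assignment vector, the following Python snippet computes the two traffic terms from hop counts. The names `hop_counts`, `control_traffic`, `b_sw` and `b_ctr`, the six-node line topology and the numeric values are illustrative assumptions, not taken from the paper; all switches are assumed to carry the same number of flows `f`.

```python
from collections import deque


def hop_counts(adj):
    """All-pairs shortest-path lengths (in hops) of an unweighted graph, via BFS."""
    dist = {}
    for src in adj:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[src] = d
    return dist


def control_traffic(assign, w, b_sw, b_ctr, f):
    """Return (Ctr-Sw, Ctr-Ctr) bandwidth for an assignment {switch: controller}."""
    load = {}  # number of switches assigned to each controller
    for s, c in assign.items():
        load[c] = load.get(c, 0) + 1
    sw = b_sw * f * sum(w[s][c] for s, c in assign.items())
    # Traffic initiated at c towards c2 scales with the load of c (Remark 2).
    ctr = b_ctr * sum(load[c] * w[c][c2] for c in load for c2 in load)
    return sw, ctr


# Hypothetical line topology 0-1-2-3-4-5 (not the network of Figure 1).
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 5] for i in range(6)}
w = hop_counts(adj)

centralized = {s: 2 for s in adj}                    # one controller at switch 2
distributed = {0: 1, 1: 1, 2: 1, 3: 4, 4: 4, 5: 4}   # controllers at switches 1 and 4

sw_c, ctr_c = control_traffic(centralized, w, b_sw=1.0, b_ctr=0.5, f=2)
sw_d, ctr_d = control_traffic(distributed, w, b_sw=1.0, b_ctr=0.5, f=2)
print(sw_c + ctr_c, sw_d + ctr_d)
```

With these assumed values, the second controller removes 10 units of Ctr–Sw traffic and adds 9 units of Ctr–Ctr traffic, which illustrates the trade-off the objective captures.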
4. Problem Solution
4.1. Insights from the Closed-Form Solution for Mesh Networks
Let us consider a mesh network where all switches have the same number $f$ of flows and there is a link between each pair of switches. We search for the optimal controller placement and switch assignment; that is, the solution to the problem of Equation (4). Because of the mesh network symmetry, only the number of controllers affects the solution efficiency and not their placement. Thus, after finding the optimal size $C^*$ of the controller set, their placement can be done randomly. In addition, the switches can be randomly assigned to the controllers, keeping in mind that each controller must control at least the switch that it is collocated with.
From Equation (
1) and Remark 1, we have
Moreover, each controller
c is one-hop away from the other controllers, and it is responsible for
switches (
,
and
are equivalent to
,
and
respectively). Thus,
, since all controllers are responsible for all switches. As follows, from Equation (
2) and Remark 2, we have
The number
minimizing the sum
is equal to
This is a toy example that clearly presents an outcome of this study; namely, the relation between $C^*$, the fraction $b_{sw} f / b_{ctr}$ and the network size $S$. The optimal number of controllers increases with the Ctr–Sw traffic ($b_{sw} f$) and decreases with the Ctr–Ctr traffic ($b_{ctr}$) and the network size $S$. Next, we formulate the same problem for various topologies and examine optimal and heuristic solutions.
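The closed form for the mesh can be retraced as follows. This is a sketch under assumptions that are not spelled out symbol by symbol in the text above: $C$ denotes the number of controllers, $b_{sw}$ and $b_{ctr}$ are the per-flow and per-assigned-switch bandwidths of Remarks 1 and 2, every path between two distinct nodes of the mesh has length one, and a switch collocated with its controller contributes no Ctr–Sw traffic.

```latex
\begin{align*}
B_{sw} &= b_{sw}\, f\, (S - C), \\
B_{ctr} &= \sum_{c} (C - 1)\, b_{ctr}\, n_c \;=\; b_{ctr}\, S\, (C - 1), \\
B_{sw} + B_{ctr} &= \left( b_{ctr}\, S - b_{sw}\, f \right) C + \left( b_{sw}\, f - b_{ctr} \right) S .
\end{align*}
```

Under these assumptions the total is linear in $C$, so the minimum sits at an endpoint of the range: a single controller when $b_{sw} f / b_{ctr} < S$, and one controller per switch when $b_{sw} f / b_{ctr} > S$.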
4.2. Integer Quadratic Programming (IQP) for Optimal Solution
Let us examine the case of a general network topology $G$, where, similar to our toy example, all switches have the same number of flows f. This simplifying assumption is a first step to approach an otherwise fairly complicated problem, and it is the major compromise we make in this work in our attempt to comprehend the nature of the problem. We also assume that both the number of flows per switch f and the network topology $G$ remain constant for the interval where the resulting placement and assignment will be applied.
The optimization problem of Equation (4) is equivalent to the following IQP problem

$$\min_{x,\,n,\,y} \; b_{sw} f \sum_{i=1}^{S} \sum_{j=1}^{S} w_{ij}\, x_{ij} + b_{ctr} \sum_{i=1}^{S} \sum_{j=1}^{S} w_{ij}\, n_i\, y_j \tag{8}$$

$$\text{s.t.} \quad \sum_{j=1}^{S} x_{ij} = 1 \;\; \forall i, \qquad n_j = \sum_{i=1}^{S} x_{ij} \;\; \forall j, \qquad n_j \le S\, y_j \;\; \forall j, \qquad x_{ij},\, y_j \in \{0, 1\}, \tag{9}$$

where $i$ and $j$ take integer values from 1 to $S$, and each value corresponds to a switch $s_i \in \mathcal{S}$. If $x_{ij} = 1$, then switch $s_i$ is assigned to controller $c_j$, which is collocated with switch $s_j$. Non-negative integer $n_j$ is the number $n_{c_j}$ of switches assigned to controller $c_j$. Binary $y_j = 1$ if and only if a controller is placed at switch $s_j$. Finally, $w_{ij}$ is the length of the path connecting switch $s_i$ (or controller $c_i$) and controller $c_j$.

In Equation (8), the minimized sum consists of two terms; the first corresponds to the Ctr–Sw traffic ($B_{sw}$) and the second one to the Ctr–Ctr traffic ($B_{ctr}$). The first term is the sum of the lengths of the paths connecting all switches to their controllers, multiplied by $b_{sw} f$. The second term is the sum over all controller pairs of the products between the length of the path connecting them and the number of switches assigned to one of them, scaled by $b_{ctr}$. The first constraint restricts each switch to be assigned to only one controller, while the second and third constraints guarantee that $n_j$ and $y_j$ have the aforementioned meaning. Especially for the third constraint, binary variable $y_j$ has to be minimized; thus, $y_j = 1$, if $n_j > 0$; otherwise, $y_j = 0$, since $n_j \le S$.
The optimal controller placement given by the solution of Problem (
8)–(
9) is
and the switch assignment is
. Given the symmetric matrix
induced by
, the objective function of Equation (
8) is equivalent to
, where
, and . The vectors and are composed of all the rows of the matrixes and respectively. The superscript indices next to the matrix symbol give the matrix dimensions, while the matrix is full of zeros.
It is not hard to show that this problem is a generalization of the well-studied facility location problem, which is NP-hard, as already observed in [17]. For large network instances, using IQP to solve the problem might take a prohibitively long time, especially considering that in a dynamic environment the solution is of value only for as long as the network topology $G$ and the average number of flows f stay constant. In Sections 5 and 6, we present some heuristics which trade off optimality for more reasonable computation times.
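For very small networks, the optimum of Equation (4) can be cross-checked by plain enumeration instead of an IQP solver. The sketch below is such a brute-force check, not the IQP formulation itself; the function names, the four-node line topology and the parameter values are hypothetical, and it assumes that every controller serves the switch it is collocated with.

```python
from collections import deque
from itertools import product


def hop_counts(adj):
    """All-pairs shortest-path lengths (in hops) of an unweighted graph, via BFS."""
    dist = {}
    for src in adj:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[src] = d
    return dist


def total_cost(assign, w, b_sw, b_ctr, f):
    """Total control traffic; assign[s] is the controller of switch s."""
    load = {}
    for c in assign:
        load[c] = load.get(c, 0) + 1
    sw = b_sw * f * sum(w[s][c] for s, c in enumerate(assign))
    ctr = b_ctr * sum(load[c] * w[c][c2] for c in load for c2 in load)
    return sw + ctr


def exhaustive_optimum(adj, b_sw, b_ctr, f):
    """Try all S**S assignment vectors; nodes must be labelled 0..S-1."""
    w = hop_counts(adj)
    nodes = sorted(adj)
    best, best_cost = None, float("inf")
    for assign in product(nodes, repeat=len(nodes)):
        # Assumption: a switch hosting a controller is assigned to that controller.
        if any(assign[c] != c for c in set(assign)):
            continue
        cost = total_cost(assign, w, b_sw, b_ctr, f)
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost


adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # hypothetical line 0-1-2-3
few, few_cost = exhaustive_optimum(adj, b_sw=1, b_ctr=10, f=1)
many, many_cost = exhaustive_optimum(adj, b_sw=1, b_ctr=0.01, f=1)
print(len(set(few)), few_cost, len(set(many)), many_cost)
```

The number of candidate vectors grows as $S^S$, so this check is only usable for a handful of switches, which is the scaling issue that motivates the heuristics.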
6. Iterative Heuristics
Although the performance of the previous heuristic is in general satisfactory, examining a single value is obviously an extreme scenario for a more generic family of iterative heuristic solutions. We propose here two deterministic iterative algorithms examining multiple candidate placements based on a suitably defined local search procedure.
6.1. Local Search Algorithm with a Fixed Number of Controllers
A local search algorithm moves along a trajectory of feasible solutions, where each solution is a neighbor of the immediately preceding one, yielding iteratively lower costs until we reach a point where no neighboring solution further lowers the cost. To move to a new solution, all neighbor solutions of the current one are examined, and the one yielding the largest decrease is selected. The obvious drawback of this algorithm is that it generally saturates at some local optimum, which might not be a global one.
A key design decision in a local search algorithm is to define what constitutes a neighboring solution. There is an underlying trade-off. If the neighborhood size is too large, a lot of time is consumed by evaluating all the neighbor solutions before each move. In the limit as the neighborhood is expanded, this approaches exhaustive search. On the other hand, if the neighborhood is too small, it is easier to get trapped in a local optimum.
In our context, we chose a simple notion of neighborhood. In particular, we consider placement $\mathcal{C}'$ to be a neighbor of placement $\mathcal{C}$ if and only if $\mathcal{C}'$ is derived from $\mathcal{C}$ by moving exactly one controller to an adjacent switch not currently hosting another controller. With this definition, the number of controllers always stays constant. We call the algorithm performing a local search starting from any initial placement local-search-fixed, because the number of controllers is invariant throughout it. Its detailed operation is presented in Algorithm 1.
A first generalization of the simple heuristic of the previous section is thus to take the placement it produces and plug it into the local-search-fixed algorithm, which then performs a local search until saturation; that is, until there is no move to a neighboring placement that lowers the cost.
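Algorithm 1 Local-search algorithm with fixed number of controllers (local-search-fixed).
input: network graph and initial placement, per-unit loads of the Ctr–Sw and Ctr–Ctr traffic, f
set the current placement to the initial placement
compute the switch assignment and cost of the current placement (solution of Prob. (12)–(13))
repeat
    set the best cost of this round to the current cost
    for each controller of the current placement do
        for each adjacent switch not hosting a controller do
            form the neighbor placement by moving the controller to that switch
            compute the switch assignment and cost of the neighbor placement
            if the neighbor cost is lower than the best cost of this round then
                keep the neighbor placement as the candidate
                update the best cost of this round
            end if
        end for
    end for
    move to the candidate placement, if one was found
until no neighbor placement lowers the cost
return the final placement and its cost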
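The local search described above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' implementation: the function names and the five-node line topology are hypothetical, and the assignment step of Prob. (12)–(13) is replaced by a simple nearest-controller assignment, which is not guaranteed to be optimal.

```python
from collections import deque


def hop_counts(adj):
    """All-pairs shortest-path lengths (in hops) of an unweighted graph, via BFS."""
    dist = {}
    for src in adj:
        d = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in d:
                    d[v] = d[u] + 1
                    queue.append(v)
        dist[src] = d
    return dist


def placement_cost(placement, w, b_sw, b_ctr, f):
    """Cost of a placement when every switch joins its nearest controller.

    Nearest-controller assignment is an assumption standing in for the
    assignment step of Prob. (12)-(13).
    """
    load = {c: 0 for c in placement}
    hops = 0
    for s in w:
        c = min(placement, key=lambda k: (w[s][k], k))  # ties broken by label
        load[c] += 1
        hops += w[s][c]
    ctr = sum(load[c] * w[c][c2] for c in placement for c2 in placement)
    return b_sw * f * hops + b_ctr * ctr


def local_search_fixed(adj, placement, b_sw, b_ctr, f):
    """Best-improvement local search that keeps the number of controllers fixed."""
    w = hop_counts(adj)
    best = frozenset(placement)
    best_cost = placement_cost(best, w, b_sw, b_ctr, f)
    while True:
        cand, cand_cost = best, best_cost
        for c in best:
            for v in adj[c]:
                if v in best:
                    continue  # the adjacent switch already hosts a controller
                neighbor = (best - {c}) | {v}
                cost = placement_cost(neighbor, w, b_sw, b_ctr, f)
                if cost < cand_cost:
                    cand, cand_cost = neighbor, cost
        if cand_cost >= best_cost:
            return best, best_cost  # no neighbor lowers the cost
        best, best_cost = cand, cand_cost


# Hypothetical line topology 0-1-2-3-4 with one controller starting at the edge.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 4] for i in range(5)}
placement, cost = local_search_fixed(adj, {0}, b_sw=1, b_ctr=1, f=1)
print(sorted(placement), cost)
```

In this toy run the controller walks from switch 0 through switch 1 to the middle switch 2, where neither neighbor placement lowers the cost any further.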
6.2. Local Search Algorithm with Variable Number of Controllers
A second generalization comes from the observation that the number of controllers derived from the regression is an estimate which does not coincide with the optimal number in many cases, but generally does not differ from it by more than a few controllers. A natural idea is then to apply local search not only starting from the estimated number of nodes, but to also probe lower and higher numbers of controllers. For each number, we use the betweenness centrality rank of nodes to select the initial placement, and subsequently apply the local-search-fixed algorithm. First, we keep reducing the number of controllers, one at a time, until we observe an increase in the returned cost. Similarly, we then increase the number of controllers from the estimate upwards until an increase of cost is observed. When lower cost values have been observed for both lower and higher numbers of controllers than the estimate, we simply select the placement that yields the lowest cost. We call this algorithm local-search-variable. The related pseudocode is presented in Algorithm 2.
Note that we still examine higher numbers, even if we have found the cost to decrease with numbers lower than the estimate. Indeed, there is no guarantee of monotonicity of the output of the algorithm, something we have also observed on several occasions in practice. This means that, for example, the returned cost for 10 controllers might be higher than the returned cost for both 9 and 11 controllers. A consequence of this is that we cannot be certain that we would not have encountered even lower cost values if we had ignored the cost increases at the points where we stopped, and continued increasing or decreasing the controller cardinality. A natural question then arises, whether applying local-search-fixed for all possible numbers of controllers is worth the extra computation time. In our simulations, we found that this approach very rarely produced any improvement at all over local-search-variable, while often consuming significantly more time. The regression-based estimation of the number of controllers pays off by saving us valuable computation time.
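Algorithm 2 Local-search algorithm with variable number of controllers (local-search-variable).
input: network graph, per-unit loads of the Ctr–Sw and Ctr–Ctr traffic, f, betweenness-based order of S
estimate the number of controllers from the regression
take as initial placement the switches ranked highest in the betweenness-based order
run local-search-fixed and record the returned placement and cost
repeat
    reduce the number of controllers by one
    take as initial placement the switches ranked highest in the betweenness-based order
    run local-search-fixed
    record the returned placement and cost if the cost is the lowest found so far with fewer controllers
until the returned cost increases
reset the number of controllers to the estimate
repeat
    increase the number of controllers by one
    take as initial placement the switches ranked highest in the betweenness-based order
    run local-search-fixed
    record the returned placement and cost if the cost is the lowest found so far with more controllers
until the returned cost increases
if the lowest cost found with more controllers is below the lowest cost found with fewer controllers then
    select the placement found with more controllers
end if
return the selected placement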
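The probing of lower and higher controller counts can be sketched as below. The sketch assumes hypothetical names (`local_search_variable`, `order`, `c_hat`, `search`): `order` is the betweenness-based ranking taken as input, `c_hat` is the regression estimate, and `search` stands in for local-search-fixed. The toy cost table is made up to reproduce the non-monotonic behaviour mentioned in the text.

```python
def local_search_variable(order, c_hat, search):
    """Probe controller counts below and above the estimate c_hat.

    order  : switches sorted by decreasing betweenness centrality
    c_hat  : estimated number of controllers (1 <= c_hat <= len(order))
    search : callable(placement) -> (placement, cost), e.g. local-search-fixed
    """
    def run(k):
        # Initial placement: the k highest-ranked switches.
        return search(frozenset(order[:k]))

    start = run(c_hat)
    best = start
    for step in (-1, 1):  # first fewer controllers, then more
        prev_cost = start[1]
        k = c_hat + step
        while 1 <= k <= len(order):
            result = run(k)
            if result[1] > prev_cost:  # stop at the first cost increase
                break
            if result[1] < best[1]:
                best = result
            prev_cost = result[1]
            k += step
    return best


# Made-up cost per number of controllers, standing in for a real local search.
toy_cost = {1: 9, 2: 7, 3: 8, 4: 6, 5: 10}


def toy_search(placement):
    return placement, toy_cost[len(placement)]


placement, cost = local_search_variable(order=[2, 1, 3, 0, 4], c_hat=3, search=toy_search)
print(sorted(placement), cost)
```

In this toy run the cost drops when going from three to two controllers, yet the upward probe still finds a cheaper placement with four controllers, which is why both directions are examined.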
6.3. Evaluation of Iterative Heuristics
In
Figure 6a we can see how the two proposed iterative heuristics compare with our initial heuristic in terms of cost increase from the optimal cost value. We have clustered our results in five groups according to the number of switches of the respective topologies. While network size alone is not sufficient for characterizing the difficulty of the minimum traffic problem in hand, it provides a coarse measure for estimating it. Similar to the previous section, the percentage increase is the average over all integer values of
between 1 and the value that both our heuristic method and the optimal solution place controllers at all network switches.
While the median value of cost increase does not surpass 5% even for our simple initial heuristic, the importance of applying an iterative heuristic algorithm becomes greater when we look at worst case cost increases. Indeed, while in the worst case the cost increase with the initial heuristic can surpass 10%, the respective worst case increase for local-search with a fixed number of controllers is no more than 6%, and for a variable number of controllers just a little over 2%.
The results of the time required to obtain these values are depicted in
Figure 6b. For topologies with a small number of switches, we observe that most of the time the optimal solution can be found even faster than the iterative heuristics, as the problem at hand is relatively easy. As the size and complexity of the graph grows, however, greater and greater time savings can be obtained by the heuristic solutions.
8. Discussion and Future Work
In our search for efficient heuristics, we also experimented with some meta-heuristic approaches. One of them consisted of repeating the local-search-fixed algorithm multiple times, starting from random initial placements. This is a typical approach to combat the convergence of local search algorithms to local optima. Another meta-heuristic we examined is simulated annealing [26], which, at every step, picks a neighbor solution at random and decides whether to move to it with a probability that depends on the cost difference between the current and the neighbor solution and on a control parameter called temperature, which decreases with time. We have found both of these meta-heuristics to require substantially more time to reach the same performance as our proposed heuristics, which is explained by the fact that they do not use problem-specific information.
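The simulated annealing step described above can be illustrated with a short sketch. The acceptance rule is not spelled out in the text; the sketch assumes the standard Metropolis criterion and a geometric cooling schedule, and all names, parameter values and the toy cost function are illustrative.

```python
import math
import random


def accept_probability(delta, temperature):
    """Metropolis rule: always accept improvements, worse moves with exp(-delta/T)."""
    return 1.0 if delta <= 0 else math.exp(-delta / temperature)


def simulated_annealing(x0, cost, random_neighbor, t0=1.0, cooling=0.95,
                        steps=200, seed=0):
    """Generic annealing loop that remembers the best solution visited."""
    rng = random.Random(seed)
    current, current_cost = x0, cost(x0)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(steps):
        cand = random_neighbor(current, rng)
        cand_cost = cost(cand)
        if rng.random() < accept_probability(cand_cost - current_cost, t):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t *= cooling  # the temperature decreases with time
    return best, best_cost


def line_cost(c):
    """Toy cost: total hop count from 9 switches on a line to a controller at c."""
    return sum(abs(s - c) for s in range(9))


def random_step(c, rng):
    """Move the controller one switch left or right, staying on the line."""
    return min(8, max(0, c + rng.choice((-1, 1))))


best, best_cost = simulated_annealing(0, line_cost, random_step)
print(best, best_cost)
```

Unlike the local search of Section 6, each step here evaluates a single random neighbor, so many more steps are typically needed to reach a comparable cost.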
The selection of the betweenness centrality metric as a common element of our heuristic algorithms was made on the basis of its simplicity and the fact that it is calculated in very short time. As part of future work, we plan to extend our research by using other centrality metrics, apart from betweenness.
Even though the paper focused on the problem of minimizing the control traffic volume, the analysis can be easily extended with a bias for short paths between switches and controllers. Path hop counts are usually the major factor affecting latency in dense networks with negligible propagation delays. A goal of minimizing both the total control traffic and the average Ctr–Sw delay could be expressed as minimizing a weighted sum of the two quantities, which is the following

$$\min_{\mathbf{c}} \left( B_{sw} + B_{ctr} + \alpha\, \frac{1}{S} \sum_{s \in \mathcal{S}} w_{s,c_s} \right),$$

where $\alpha$ is a parameter controlling the relative weight of the average Ctr–Sw hop count minimization in the objective. From the above formulation, it is evident that our results in this paper can be applied directly just by replacing the ratio $b_{sw} f / b_{ctr}$ with $(b_{sw} f + \alpha / S) / b_{ctr}$.